# Ikiwiki to Hugo conversion notes

## Why
- too slow: ikiwiki takes 30 seconds to refresh even when changing a single page
- hard to maintain: my patches to ikiwiki are still not merged, which makes upgrades painful
- hard to deploy: it's difficult to tell people to use ikiwiki because it's really hard to install and deploy a new wiki... you need to install the Debian package, then create a git repository (or SVN? or darcs! why not), then create a `.setup` file, then... I forgot! I had to use `ikiwiki-hosting` to make my life easier, and that just adds another layer of complexity.
- unusual templating engine: Perl's templates may have been great at some point, but they are definitely showing their age now. Something more standard like Jinja or Golang templates would be easier for designers to use.
- sometimes strange markup rules: just writing this document was a challenge, because preformatted markdown text (prefixed with four spaces) is interpreted by the wikilink parser, which led to rendered errors like:

  ```
  1 [[!bibtex2html <span class="error">Error: cannot find bestlink for "<span"</span>]]
  18 [[!img <span class="error">Error: bad image filename</span>]]
  ```

  I had to use the `format` directive to work around those problems.
## First conversion attempt

I had to rename all files and move everything into `content/`, then things generally started "working" (as in "breaking").
I had to clone a theme; the quickstart suggests:

```
git submodule add https://github.com/budparr/gohugo-theme-ananke.git themes/ananke
```
Then I had a failure to parse comments:

```
Error: Error building site: "/home/anarcat/wikis/anarc.at/content/blog/2013-02-04-why-i-dont-pulseaudio.md:2:1": starting HTML comment with no end
```

Workaround, delete all comments:

```
505  2019-07-18 17:22:08 grep -l -r -- '<!--' * | grep -e comment -e '\.md$' | xargs sed -i '/<!--/d'
```
A long-term solution might be to convert comments to shortcodes (a sketch follows below).

I also tried:

```
607  2019-07-18 17:11:55 grep -l -r -- '-->$' * | grep -e comment -e '\.md$'
608  2019-07-18 17:12:15 grep -l -r -- '-->$' * | grep -e comment -e '\.md$' -0 | xargs -0 sed -i 's/-->$/!-->/'
609  2019-07-18 17:12:23 grep -l -r -- '-->$' * | grep -e comment -e '\.md$' | xargs sed -i 's/-->$/!-->/'
```
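If I go the shortcode route, the conversion pass could look something like this (a hedged sketch: the `comment` shortcode name and its empty `layouts/shortcodes/comment.html` template are my invention, not something Hugo ships):

```python
import pathlib
import re

# multi-line HTML comments, non-greedy so adjacent comments stay separate
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)

for path in pathlib.Path("content").rglob("*.md"):
    text = path.read_text()
    # wrap the comment body in a paired shortcode; an empty
    # layouts/shortcodes/comment.html template renders nothing,
    # but the text survives in the source
    new = COMMENT_RE.sub(r"{{% comment %}}\1{{% /comment %}}", text)
    if new != text:
        path.write_text(new)
```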
Another failure is when it finds an HTML file with an unquoted `href` argument (e.g. `hardware/phone/htc-one-s/apps.html`).
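A naive fix, if I ever need one, could be a quick regex pass (a sketch only, not a real HTML parser, and not part of the actual conversion):

```python
import re

# naive: wrap bare href values in double quotes; good enough for
# simple legacy pages, but it will not handle every HTML corner case
UNQUOTED_HREF_RE = re.compile(r'href=([^"\'>\s][^>\s]*)')

def quote_hrefs(html: str) -> str:
    return UNQUOTED_HREF_RE.sub(r'href="\1"', html)

print(quote_hrefs('<a href=apps.html>apps</a>'))
# -> <a href="apps.html">apps</a>
```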
## Hugo ultra short primer

- `apt install hugo` - available in Debian, also there's a newer version in unstable
- `hugo` builds stuff
- `hugo serve` does that and serves on localhost, with autoreload
- `hugo new` to post new stuff
Maybe the Emacs mode could be useful.
## Results

Result of running a `hugo` build after the renames:

```
                   | EN
+------------------+-----+
  Pages            | 734
  Paginator pages  |  63
  Non-page files   | 549
  Static files     |   3
  Processed images |   0
  Aliases          |  12
  Sitemaps         |   1
  Cleaned          |   0
```
Things generally look like crap:
- ikiwiki-specific links are not parsed
- no directives are parsed, so most content is broken
- links are broken
- blog posts are not sorted properly and generally look like crap as well
## Inventory
List of directives used in my wiki:
```
$ git grep -h '\[\[!' | sed 's/\[\[!/\n[[!/g' | grep '\[\[!' | sed 's/ .*//' | sort | uniq -c | sort -n
      1 [[!bibtex2html <span class="error">Error: cannot find bestlink for "<span"</span>]]
        ^ false positive, in software/ikiwiki-osm
      1 <a class="toggle" href="#services-wiki-ikiwiki-hugo-conversion.default">more</a>
        ^ IMPORTANT, need to figure it out
     18 [[!img
        ^ IMPORTANT, need to figure it out
     22 [[!color
        ^ services table, rebuild by hand
     26 [[!iki
        ^ shortcodes?
     50 [[!format
        ^ IMPORTANT, need to figure it out
     55 [[!shortcut
        ^ shortcode, false positive (in shortcuts)
     72 [[!wikipedia
        ^ shortcode
     96 [[!toc
        ^ IMPORTANT, need to figure it out (see above)
    109 [[!debcve
        ^ shortcode
    115 [[!debbug
        ^ shortcode
    142 [[!debpkg
        ^ shortcode
    268 [[!inline
        ^ mostly used in frontpage and blog, need to figure it out
    335 [[!tag
        ^ IMPORTANT, need to figure it out
    358 [[!comment
        ^ IMPORTANT, need to figure it out
   1254 [[!meta
        ^ IMPORTANT, need to figure it out
```
## Magic links

There are also a ton of "magic ikiwiki links", called wikilinks, which have their own unique logic:

```
$ git grep -h '\[\[[^!]' | sed 's/\[\[/\n[[/g' | grep '\[\[[^!]' | wc -l
631
```

Those will be difficult to convert, as the semantics of internal linking in Markdown are not well defined. Or rather, they are bound to HTML (in general) and ikiwiki goes beyond that. Some research needs to be done to see how other engines handle this and how it compares to ikiwiki's linking rules.
The peculiarities of wikilinks in ikiwiki (a resolver sketch follows the list):

- case-insensitive (e.g. `[[OtherPage]]` and `[[otherpage]]` both work) - implemented
- subpage lookups (e.g. `[[otherpage]]` in `foo/subpage` will look for `foo/subpage/otherpage`, `foo/otherpage`, `otherpage`, in order; `[[foo/subpage]]` will find `/foo/subpage` from `bar`, instead of the expected `bar/foo/subpage` in HTML) - implemented
- absolute lookups (prefixed with `/`, e.g. `[[/about]]` links to `https://example.com/foo/about` if the wiki is in `example.com/foo`, and not `https://example.com/about` as HTML normally would - probably relevant only for wikis in subdirectories) - NOT IMPLEMENTED
- userdir lookups (`[[anarcat]]` links to `[[users/anarcat]]` if userdir is set to `users`) in some contexts (namely comments, recentchanges, but not normal content) - NOT IMPLEMENTED
- backslash escapes (`\[[WikiLink]]` is not a link) - implemented by the caller (in LINK_RE)
- anchor lookups (`[[WikiLink#foo]]`) - implemented by the caller (in LINK_RE)
- there might be other rules like underscore (`_`) mapping to spaces and other funky escape mechanisms - NOT IMPLEMENTED, look at IkiWiki::titlepage for those
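A minimal sketch of those lookup rules, assuming `pages` is the set of all page paths in the wiki (without extensions); the `LINK_RE` here is a simplified stand-in for the real one mentioned above:

```python
import re

# optional backslash escape, page name, optional #anchor (simplified)
LINK_RE = re.compile(r'(\\?)\[\[([^\]|#]+)(?:#([^\]|]+))?\]\]')

def resolve(link: str, current: str, pages: set[str]) -> str | None:
    """Resolve a wikilink found in page `current`; None if no match."""
    lower = {p.lower(): p for p in pages}   # case-insensitive lookups
    if link.startswith('/'):                # absolute: wiki root only
        return lower.get(link[1:].lower())
    # subpage lookups: current page first, then each parent up to the root
    parts = current.split('/')
    for i in range(len(parts), -1, -1):
        candidate = '/'.join(parts[:i] + [link]).lower()
        if candidate in lower:
            return lower[candidate]
    return None

pages = {'foo/subpage/otherpage', 'foo/otherpage', 'otherpage'}
assert resolve('OtherPage', 'foo/subpage', pages) == 'foo/subpage/otherpage'
assert resolve('/otherpage', 'foo/subpage', pages) == 'otherpage'
```

Backslash escapes and anchors would be handled by the caller iterating over `LINK_RE` matches: group 1 non-empty means the link is escaped and should be left alone, group 3 carries the anchor.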
## Tasks

The gist of it is we need to implement:

- meta (in progress; a front matter sketch follows this list)
- foo/ and foo.mdwn rename to foo/_index.mdwn (see also page bundles and content organization)
- `[[link]]` and `[[link|text]]`, hard because we need to figure out pagespec? maybe links and cross-references could save us, or maybe just relative URLs - implemented some of that logic in the parser
- incidentally, backslashed stuff like the above link stuff for example
- table of contents could be a problem: Hugo only has support through templates, not markup (or maybe a shortcode would work?) - implemented a directive parser that converts to GitLab's `[[_TOC_]]` (which might be reused)
- img directives (maybe this works)
- format (shortcodes? or syntax highlighting)
- shortcodes (dokuwiki converter also suggests using shortcodes for interwiki)
- admonitions (same as shortcode?)
- switch to a branch before making changes?
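For the meta task, a rough sketch of the idea: collect `[[!meta]]` directives into Hugo front matter fields. The field mapping below is my guess (`lastmod` is Hugo's name for the last-modified field); the real list of fields is much longer, as the 1254 hits in the inventory suggest:

```python
import re

META_RE = re.compile(r'\[\[!meta (\w+)="([^"]*)"\]\]')

# assumed mapping from ikiwiki meta fields to Hugo front matter
FIELD_MAP = {'title': 'title', 'date': 'date', 'updated': 'lastmod'}

def front_matter(markdown: str) -> dict[str, str]:
    """Collect [[!meta]] directives into Hugo front matter fields."""
    fields = {}
    for key, value in META_RE.findall(markdown):
        if key in FIELD_MAP:
            fields[FIELD_MAP[key]] = value
    return fields

assert front_matter('[[!meta title="hello"]]') == {'title': 'hello'}
```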
Structural elements needing more thinking:

- RSS
- frontpage and blog structure (`inline`)
- same with `map` and orphan pages
- comments
- tags (AKA taxonomies in Hugo parlance)
- 550 non-page files?
- git-annex stuff
- a good theme
- sidebar (maybe see sections)
- blog posts outside of `blog/`
- search
Will be converted by hand:

- services table (color)
- bibtex
- toggle in blog
- pagestats in tags and monthly reports
- openid.mdwn redirect (`meta` redirections)
Work is ongoing in this conversion script.
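For reference, the skeleton of such a conversion looks roughly like this (my own hedged sketch of the rename step, not the linked script):

```python
import pathlib
import shutil

SRC = pathlib.Path('.')        # ikiwiki source checkout
DST = pathlib.Path('content')  # Hugo content directory

for mdwn in SRC.rglob('*.mdwn'):
    rel = mdwn.relative_to(SRC)
    if rel.parts[0] == 'content':
        continue  # don't re-process our own output
    if mdwn.with_suffix('').is_dir():
        # foo.mdwn with a foo/ directory becomes a page bundle:
        # content/foo/_index.md
        target = DST / rel.with_suffix('') / '_index.md'
    else:
        target = DST / rel.with_suffix('.md')
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(mdwn, target)
    # directive and wikilink rewriting would happen here
```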
## Comments

Comments are a particular beast and deserve their own section. Here are the solutions I found online:

Project | Backend | Notes |
---|---|---|
Commento++ | Go + JS | not in Debian, but lots of features |
Discourse | N/A | used by the Discourse founder for his comments |
Isso | Python + JS | drop-in replacement for Disqus, used by researchhut |
Mastodon | N/A | JAK wrote a Mastodon comment server, or embed |
Remark42 | Go + JS | not sure... unusual database format |
Hashover | PHP + JS | |
Feature comparison:

Project | Features |
---|---|
Commento++ | markdown, Disqus import, voting, spam detection, sticky comments, thread locking, Akismet, email notifications, social login (Google, GitHub, GitLab, Twitter), SSO, hosted version, not in Debian, PostgreSQL backend, 11KB client |
Coral | SSO via JWT, social (Google, Facebook, OpenID)/email auth, custom CSS, email notifications, comment count, GDPR compliance, Slack integration |
Isso | markdown, Disqus import, voting, web or email-based moderation, email notifications, Debian package removed, rate-limiting, RSS, i18n, SQLite backend, 40KB client |
Remark42 | markdown, Disqus import, social auth, moderation, voting, pinning, image uploads, RSS, bolt backend, email/telegram notifications |
Talkyard | markdown, moderation, voting, Q&A, anonymous comments, chat, hosted version |
Hashover | threads, likes, popularity, avatars, admin interface, templates, plugins, sorting, spam filter, email notifications, RSS feeds, XML/JSON or SQL backend |
Discarded alternatives:
Platform | Why |
---|---|
Disqus | common, proprietary spyware, to be avoided |
Facebook comments | same, though maybe less common |
CMSes (WP, Drupal, etc) | I want a static website |
JustComments | opencore model, no moderation or spam control in free version, dead site |
Talk/Coral | a "new commenting experience" designed for "newsrooms" |
Hypothes.is | annotation system |
Caliopen | "not ready for prod", interface for proprietary platforms like Facebook/Twitter |
Utterances | github issues commenting system |
Giscus | same |
Staticman | assumes a GitHub backend |
Talkyard | Scala? seems way overkill for a simple blog |
Other ideas:

- bridgy "connects websites to social media", including Mastodon
- email gateway - a hugo plugin based on jekyll-static-comments, which takes comments by email and adds them inside the page, kind of like how ikiwiki comments work (except with email instead of CGI); a rough sketch follows this list
- cactus comments: "federated comment system for the web, based on the Matrix protocol"
- https://sive.rs/shc - PostgreSQL NOTIFY and LISTEN and a Ruby backend generating static HTML comments, bespoke
- comments as PRs
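To give an idea of the email gateway approach, a hedged sketch (the file layout, front matter fields, and subject-as-post-name convention are all made up here):

```python
import email
import email.policy
import hashlib
import pathlib
import sys

# read one comment email on stdin, drop it next to the post as a
# Markdown file a comments partial could later pick up
msg = email.message_from_file(sys.stdin, policy=email.policy.default)
part = msg.get_body(preferencelist=('plain',))
body = part.get_content() if part else ''
post = msg['Subject']  # assume the subject names the target post
                       # (sanitize this in real life!)

comment_id = hashlib.sha256(body.encode()).hexdigest()[:8]
comment = pathlib.Path('content') / post / f'comment-{comment_id}.md'
comment.parent.mkdir(parents=True, exist_ok=True)
comment.write_text(f"""---
author: "{msg['From']}"
date: "{msg['Date']}"
---

{body}
""")
```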
## Other converters

## Other alternatives
Consider alternative SSGs:
- 11ty: picked by mozilla, javascript
- lektor: used at Tor
- pelican: watch out; another user reports that generating a 500-page site takes 30 seconds with caching, and 2 minutes without
- zola
See also those comparisons:
Inspiring themes:
Other ideas:
- soupault can do post-processing of the HTML rendered by any SSG, which might provide an interesting base to build what's missing from an alternative