X-Git-Url: http://git.vanrenterghem.biz/git.ikiwiki.info.git/blobdiff_plain/9ef8c8c47aec5f2218b42697760167b0eba6f8a3..c142dba356b757facd6684a99623c58430b7221e:/doc/todo/git-annex_support.mdwn diff --git a/doc/todo/git-annex_support.mdwn b/doc/todo/git-annex_support.mdwn index 9ebd11e06..c49cab4d2 100644 --- a/doc/todo/git-annex_support.mdwn +++ b/doc/todo/git-annex_support.mdwn @@ -9,3 +9,236 @@ Another solution would be to make ikiwiki a remote itself and allow users to pus > direct mode, but I'm not sure that's really desirable here). > I'd like to make symlinks possible without compromising security, > but it'll be necessary to be quite careful. --[[smcv]] + +First implementation +==================== + +So as the [[discussion]] shows, it seems it's perfectly possible to actually do this! There's this [gallery site](http://stockholm.kalleswork.net) which uses the [[plugins/contrib/album]] plugin and git-annex to manage its files. + +The crucial steps are: + + 1. setup a git annex remote in `$srcdir` + + 2. configure direct mode because ikiwiki ignores symlinks for [[security]] reasons: + + cd $srcdir + git annex init + git annex direct + + 3. configure files to be considered by git-annex (those will be not committed into git directly): + + git config annex.largefiles 'largerthan=100kb and not (include=*.mdwn or include=*.txt)' + + 4. make the bare repository (the remote of `$srcdir`) ignored by git-annex: + + cd $srcdir + git config remote.origin.annex-ignore true + git config remote.origin.annex-sync false + + (!) This needs to be done on *ANY* clone of the repository, which is annoying, but it's important because we don't want to see git-annex stuff in the bare repo. (why?) + + 5. deploy the following crappy plugin to make commits work again and make sure the right files are added in git-annex: + +[[!format perl """ +#!/usr/bin/perl +package IkiWiki::Plugin::gitannex; + +use warnings; +use strict; +use IkiWiki 3.00; + +sub import { + hook(type => "getsetup", id => "gitannex", call => \&getsetup); + hook(type => "savestate", id => "gitannex", call => \&rcs_commit); + # we need to handle all rcs commands maybe? +} + +sub getsetup () { + return + plugin => { + safe => 1, # rcs plugin + rebuild => undef, + section => "misc", + }, +} + +# XXX: we want to copy or reuse safe_git + +sub rcs_commit (@) { + chdir $config{srcdir}; + `git annex add --auto`; + `git annex sync`; +} + +sub rcs_commit_staged (@) { + rcs_commit($@); +} + +1 +"""]] +This assumes you know what `srcdir`, `repository` and so on mean, if you forgot (like me), see this reference: [[rcs/git/]]. + + +What doesn't work +----------------- + + * the above plugin is kind of flaky and ugly. + * it's not an RCS plugin, but probably should be, replacing the git plugin, because really: git doesn't work at all anymore at this point + +What remains to be clarified +---------------------------- + + * how do files get pushed to the `$srcdir`? Only through the web interface? + * why do we ignore the bare repository? + +See the [[discussion]] for a followup on that. --[[anarcat]] + +Alternative implementation +========================== + +An alternative implementation, which remains to be detailed but is mentionned in [[forum/ikiwiki_and_big_files]], is to use the [[underlay]] feature combined with the `hardlink` option to deploy the git-annex'd files. Then git-annex is separate from the base ikiwiki git repo. See also [[tips/Ikiwiki_with_git-annex__44___the_album_and_the_underlay_plugins]] for an example. + +Also note that ikiwiki-hosting has a [patch waiting](https://ikiwiki-hosting.branchable.com/todo/git-annex_support) to allow pushes to work with git-annex. This could potentially be expanded to sync content to the final checkout properly, avoiding some of the problems above (esp. wrt to non-annex bare repos). + +Combined with the [[underlay]] feature, this could work very nicely indeed... --[[anarcat]] + +Here's an attempt: + +
+cd /home/user +git clone source.git source.annex +cd source.annex +git annex direct +cd ../source.git +git annex group . transfer +git remote add annex ../source.annex +git annex sync annex ++ +Make sure the `hardlink` setting is enabled, and add the annex as an underlay, in `ikiwiki.setup`: + +
+hardlink: 1 +add_underlays: +- /home/w-anarcat/source.annex ++ +Then moving files to the underlay is as simple as running this command in the bare repo: + +
+#!/bin/sh + +echo "moving big files to annex repository..." +git annex move --to annex ++ +I have added this as a hook in `$HOME/source.git/hooks/post-receive` (don't forget to `chmod +x`). + +The problem with the above is that the underlay wouldn't work: for some reason it wouldn't copy those files in place properly. Maybe it's freaking out because it's a full copy of the repo... My solution was to make the source repository itself a direct repo, and then add it as a remote to the bare repo. --[[anarcat]] + +Back from the top +================= + +Obviously, the final approach of making the `source` repository direct mode will fail because ikiwiki will try to commit files there from the web interface which will fail (at best) and (at worst) add big files into git-annex (or vice-versa, not sure what's worse actually). + +Also, I don't know how others here made the underlay work, but it didn't work for me. I think it's because in the "source" repository, there are (dead) symlinks for the annexed files. This overrides the underlay, because of [[security]] - although I am unclear as to why this is discarded so early. So in order to make the original idea above work properly (ie. having a separate git-annex repo in direct mode) work, we must coerce ikiwiki into tolerating symlinks in the srcdir a little more: + +
+diff --git a/IkiWiki.pm b/IkiWiki.pm +index 1043ef4..949273c 100644 +--- a/IkiWiki.pm ++++ b/IkiWiki.pm +@@ -916,11 +916,10 @@ sub srcfile_stat { + my $file=shift; + my $nothrow=shift; + +- return "$config{srcdir}/$file", stat(_) if -e "$config{srcdir}/$file"; +- foreach my $dir (@{$config{underlaydirs}}, $config{underlaydir}) { +- return "$dir/$file", stat(_) if -e "$dir/$file"; ++ foreach my $dir ($config{srcdir}, @{$config{underlaydirs}}, $config{underlaydir}) { ++ return "$dir/$file", stat(_) if (-e "$dir/$file" && ! -l "$dir/$file"); + } +- error("internal error: $file cannot be found in $config{srcdir} or underlay") unless $nothrow; ++ error("internal error: $file cannot be found in $config{srcdir} or underlays @{$config{underlaydirs}} $config{underlaydir}") unless $nothrow; + return; + } + +diff --git a/IkiWiki/Render.pm b/IkiWiki/Render.pm +index 9d6f636..e0b4cf8 100644 +--- a/IkiWiki/Render.pm ++++ b/IkiWiki/Render.pm +@@ -337,7 +337,7 @@ sub find_src_files (;$$$) { + + if ($underlay) { + # avoid underlaydir override attacks; see security.mdwn +- if (! -l "$abssrcdir/$f" && ! -e _) { ++ if (1 || ! -l "$abssrcdir/$f" && ! -e _) { + if (! $pages{$page}) { + push @files, $f; + push @IkiWiki::underlayfiles, $f; ++ +
+../IkiWiki/Render.pm:421: find_src_files(1, \@files, \%pages); +../IkiWiki/Render.pm:846: ($files, $pages)=find_src_files(); +../po/po2wiki:18:my ($files, $pages)=IkiWiki::find_src_files(); ++ +The first occurence is in `IkiWiki::Render::process_changed_files`, where it is used mostly for populating `@IkiWiki::underlayfiles`, the only side effect of +`find_src_files`. The second occurence is in `IkiWiki::Render::refresh`. There things are a little more complicated (to say the least) and a lot of stuff happens. To put it in broad terms, first it does a `IkiWiki::Render::scan` and then a `IkiWiki::Render::render`. The last two call `srcfile()` appropriately (where i put an extra symlink check), except for `will_render()` in `scan`, which I can't figure out right now and that seems to have a lot of global side effects. It still looks fairly safe at first glance. The `rcs_get_current_rev`, `refresh`, `scan` and `rendered` hooks are also called in there, but I assume those to be safe, since they are called with sanitized values already. + +The patch does work: the files get picked up from the underlay and properly hardlinked into the target `public_html` directory! So with the above patch, then the following hook in `source.git/hooks/post-receive`: + +
+#!/bin/sh + +OLD_GIT_DIR="$GIT_DIR" +unset GIT_DIR +echo "moving big files to annex repository..." +git annex copy --to annex +git annex sync annex ++ +(I am not sure anymore why GIT_DIR is necessary, but I remember it destroyed all files in my repo because git-annex synced against the `setup` branch in the parent directory. fun times.) + +Then the `annex` repo is just a direct clone of the source.git: + +
+cd /home/user +git clone --shared source.git annex +cd annex +git annex direct +cd ../source.git +git remote add annex ../annex ++ +And we need the following config: + +
+hardlink: 1 +add_underlays: +- /home/w-anarcat/annex +add_plugins: +- underlay ++ +... and the `ikiwiki-hosting` patch mentionned earlier to allow git-annex-shell to run at all. Also, the `--shared` option will [make git-annex use hardlinks itself between the two repos](https://git-annex.branchable.com/todo/wishlist:_use_hardlinks_for_local_clones/), so the files will be available for download as well. --[[anarcat]] + +>