git clone barerepo.git srcdir
ikiwiki --rcs=git srcdir destdir
-I've fixed this initial construction case, and, based on my testing, I've also fixed the post-update executing on a new master, and ikiwiki.cgi executing on a non-existent master cases. Unfortunately these cases still generate a lot of warnings from recentchanges; I suspect a much more invasive patch would be needed to shut those up.
+I've fixed this initial construction case, and, based on my testing, I've
+also fixed the post-update executing on a new master, and ikiwiki.cgi
+executing on a non-existent master cases.
Please commit so my users stop whining at me about having clean branches to push to, the big babies.
-Summary: Change three scary loud failure cases related to empty branches into three scary loud success cases.
+Summary: Change three scary loud failure cases related to empty branches into three mostly quiet success cases.
+
+[[!tag patch]]
+
+> FWIW, [[The_TOVA_Company]] apparently wants this feature (and I hope
+> I don't mind that I mention they were willing to pay someone for it,
+> but I told them I'd not done any of the work. :) )
+>
+> Code review follows, per hunk.. --[[Joey]]
+
<pre>
diff --git a/IkiWiki/Plugin/git.pm b/IkiWiki/Plugin/git.pm
-index cf7fbe9..5b1e334 100644
+index cf7fbe9..e5bafcf 100644
--- a/IkiWiki/Plugin/git.pm
+++ b/IkiWiki/Plugin/git.pm
-@@ -439,10 +439,13 @@ sub git_commit_info ($;$) {
+@@ -439,17 +439,21 @@ sub git_commit_info ($;$) {
my @opts;
push @opts, "--max-count=$num" if defined $num;
-+ my @raw_lines;
-
+-
- my @raw_lines = run_or_die('git', 'log', @opts,
- '--pretty=raw', '--raw', '--abbrev=40', '--always', '-c',
- '-r', $sha1, '--', '.');
-+ if (-e $config{srcdir} . "/.git/refs/heads/" . $config{gitmaster_branch}) {
-+ @raw_lines = run_or_die('git', 'log', @opts,
-+ '--pretty=raw', '--raw', '--abbrev=40', '--always', '-c',
-+ '-r', $sha1, '--', '.');
-+ };
-
+-
++ my @raw_lines;
my @ci;
- while (my $parsed = parse_diff_tree(\@raw_lines)) {
-@@ -474,7 +477,10 @@ sub rcs_update () {
+- while (my $parsed = parse_diff_tree(\@raw_lines)) {
+- push @ci, $parsed;
+- }
++
++ # Test to see if branch actually exists yet.
++ if (run_or_non('git', 'show-ref', '--quiet', '--verify', '--', 'refs/heads/' . $config{gitmaster_branch}) ) {
++ @raw_lines = run_or_die('git', 'log', @opts,
++ '--pretty=raw', '--raw', '--abbrev=40', '--always', '-c',
++ '-r', $sha1, '--', '.');
++
++ while (my $parsed = parse_diff_tree(\@raw_lines)) {
++ push @ci, $parsed;
++ }
+
+- warn "Cannot parse commit info for '$sha1' commit" if !@ci;
++ warn "Cannot parse commit info for '$sha1' commit" if !@ci;
++ };
+
+ return wantarray ? @ci : $ci[0];
+ }
+</pre>
+
+My concern is that this adds a bit of slowdown (git show-ref is fast, but
+It's still extra work) to a very hot code path that is run to eg,
+update recentchanges after every change.
+
+Seems not ideal to do extra work every time to handle a case
+that will literally happen a maximum of once in the entire lifecycle of a
+wiki (and zero times more typically, since the setup automator puts in a
+.gitignore file that works around this problem).
+
+So as to not just say "no" ... what if it always tried to run git log,
+and if it failed (or returned no parsed lines), then it could look
+at git show-ref to deduce whether to throw an error or not.
+--[[Joey]]
+
+> Ah, but then git-log would still complain "bad revision 'HEAD'"
+> --[[Joey]]
+
+ jrayhawk@piny:/srv/git/jrayhawk.git$ time perl -e 'for( $i = 1; $i < 10000; $i++) { system("git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"); }'
+
+ real 0m10.988s
+ user 0m0.120s
+ sys 0m1.210s
+
+> FWIW, "an extra millisecond per edit" vs "full git coverage" is no
+> contest for me; I use that patch on seven different systems, including
+> freedesktop.org, because I've spent more time explaining to users either
+> why Ikiwiki won't work on their empty repositories or why their
+> repositories need useless initial commits (a la Branchable) that make
+> pushing not work and why denyNonFastForwards=0 and git push -f are
+> necessary than all the milliseconds that could've been saved in the
+> world.
+>
+> But, since we're having fun rearranging deck chairs on the RMS Perl
+> (toot toot)...
+>
+> There's some discrepency here I wasn't expecting:
+
+ jrayhawk@piny:/srv/git/jrayhawk.git$ time dash -c 'i=0; while [ $i -lt 10000 ]; do i=$((i+1)); git show-ref --quiet --verify -- refs/heads/master; done'
+
+ real 0m9.986s
+ user 0m0.170s
+ sys 0m0.940s
+
+> While looking around in the straces, I notice Perl, unlike {b,d}ash
+> appears to do PATH lookup on every invocation of git, adding up to
+> around 110 microseconds apiece on a post-2.6.38 16-thread QPI system:
+
+ 29699 0.000112 execve("/home/jrayhawk/bin/git", ["git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"], [/* 17 vars */]) = -1 ENOENT (No such file or directory)
+ 29699 0.000116 execve("/usr/local/bin/git", ["git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"], [/* 17 vars */]) = -1 ENOENT (No such file or directory)
+ 29699 0.000084 execve("/usr/bin/git", ["git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"], [/* 17 vars */]) = 0
+
+> You can probably save a reasonable number of context switches and
+> RCU-heavy (or, previously, lock-heavy) dentry lookups by doing a Perl
+> equivalent of `which git` and using the result. It might even add up to
+> a whole millisecond in some circumstances!
+>
+> No idea where the rest of that time is going. Probably cache misses
+> on the giant Perl runtime or something.
+>
+> ...
+>
+> Now I feel dirty for having spent more time talking about optimization
+> than that optimization is likely to save. This must be what being an
+> engineer feels like.
+> --jrayhawk
+
+<pre>
+@@ -474,7 +478,10 @@ sub rcs_update () {
# Update working directory.
if (length $config{gitorigin_branch}) {
- run_or_cry('git', 'pull', '--prune', $config{gitorigin_branch});
-+ run_or_cry('git', 'fetch', '--prune', $config{gitorigin_branch});
-+ if (-e $config{srcdir} . '/.git/refs/remotes/' . $config{gitorigin_branch} . '/' . $config{gitmaster_branch}) {
-+ run_or_cry('git', 'merge', $config{gitorigin_branch} . '/' . $config{gitmaster_branch});
-+ }
++ run_or_cry('git', 'fetch', '--prune', $config{gitorigin_branch});
++ if (run_or_non('git', 'show-ref', '--quiet', '--verify', '--', 'refs/remotes/' . $config{gitorigin_branch} . '/' . $config{gitmaster_branch}) ) {
++ run_or_cry('git', 'merge', $config{gitorigin_branch} . '/' . $config{gitmaster_branch});
++ }
}
}
-
-@@ -559,7 +565,7 @@ sub rcs_commit_helper (@) {
+
+</pre>
+
+Same concern here about extra work. Code path is nearly as hot, being
+called on every refresh. Probably could be dealt with similarly as above.
+
+Also, is there any point in breaking the pull up into a
+fetch followed by a merge? --[[Joey]]
+
+> The same benchmarking applies, at least.
+>
+> Re: fetch/merge: We can't test for the nonexistence of the origin branch
+> without fetching it, and we can't merge it if it is, indeed,
+> nonexistant.
+>
+> Unless you're implying that it would be better to just spam stderr with
+> unnecessary scary messages and/or ignore/suppress them and lose the
+> ability to respond appropriately to every other error condition. As
+> maintainer, you deal with a disproportionate amount of the resulting
+> support fallout, so I'm perfectly satisfied letting you make that call.
+> --jrayhawk
+
+<pre>
+@@ -559,7 +566,7 @@ sub rcs_commit_helper (@) {
# So we should ignore its exit status (hence run_or_non).
if (run_or_non('git', 'commit', '-m', $params{message}, '-q', @opts)) {
if (length $config{gitorigin_branch}) {
+ run_or_cry('git', 'push', $config{gitorigin_branch}, $config{gitmaster_branch});
}
}
+
</pre>
+
+This seems fine to apply. --[[Joey]]
+
+> Hooray!
+> --jrayhawk