X-Git-Url: http://git.vanrenterghem.biz/git.ikiwiki.info.git/blobdiff_plain/96936899da3037fa28f8be73003a14aa829878ee..d54950bd891eb3a2eeb3300e062ce7fe6482f020:/doc/todo/should_optimise_pagespecs.mdwn diff --git a/doc/todo/should_optimise_pagespecs.mdwn b/doc/todo/should_optimise_pagespecs.mdwn index f3048f328..728ab8994 100644 --- a/doc/todo/should_optimise_pagespecs.mdwn +++ b/doc/todo/should_optimise_pagespecs.mdwn @@ -79,8 +79,6 @@ I can think about reducung the size of my wiki source and making it available on > > --[[Joey]] -[[!template id=gitbranch branch=smcv/ready/optimize-depends author="[[smcv]]"]] - >> I've been looking at optimizing ikiwiki for a site using >> [[plugins/contrib/album]] (which produces a lot of pages) and it seems >> that checking which pages depend on which pages does take a significant @@ -90,6 +88,8 @@ I can think about reducung the size of my wiki source and making it available on >> rather than a single pagespec. This does turn out to be faster, although >> not as much as I'd like. --[[smcv]] +>>> [[Merged|done]] --[[smcv]] + >>> I just wanted to note that there is a whole long discussion of dependencies and pagespecs on the [[todo/tracking_bugs_with_dependencies]] page. -- [[Will]] >>>> Yeah, I had a look at that (as the only other mention of `pagespec_merge`). @@ -98,8 +98,6 @@ I can think about reducung the size of my wiki source and making it available on >>>> I haven't actually deleted it), because the "or" operation is now done in >>>> the Perl code, rather than by merging pagespecs and translating. --[[smcv]] -[[!template id=gitbranch branch=smcv/ready/remove-pagespec-merge author="[[smcv]]"]] - >>>>> I've now added a patch to the end of that branch that deletes >>>>> `pagespec_merge` almost entirely (we do need to keep a copy around, in >>>>> ikiwiki-transition, but that copy doesn't have to be optimal or support @@ -244,11 +242,13 @@ more complicated every time! > other things that overloaded the system. > b) Suggests to me we will probably want to force a rebuild on upgrade > when fixing this (via the mechanism in the postinst). -> -> BTW, the underlying bug here is really horribly simple: -> When refreshing, `loadindex` preserves the previous depends list, -> and `add_depends` adds stuff to it. So it doubles every time a page is -> re-rendered during refresh. --[[Joey]] +> +> I've investigated why the pagespecs keep growing: When page A changes, +> its old depends are cleared. Then +> page B that inlines A gets rebuilt, and its old depends are also cleared. +> But page B also inlines page C; which means C gets re-rendered. And this +> happens w/o its old depends being cleared, so C's depends are doubled. +> --[[Joey]] After the initial optimization: 14.27s to rebuild, 8.26/8.33/8.26 to refresh. Success! @@ -262,6 +262,9 @@ might well be experimental error, for that matter). > `add_depends` had no effect. So, the commit message to > b6fcb1cb0ef27e5a63184440675d465fad652acf is actually wrong.. ? --[[Joey]] +>> I'll try benchmarking again on the non-public wiki where I had the 4% +>> speedup. The docwiki is so small that 4% is hard to measure... --[[smcv]] + Not saving {depends} to the index, using a hash instead of a list to de-duplicate, and allowing add_depends to take an arrayref instead of a single pagespec had no noticable positive or negative effect on this test. @@ -269,11 +272,17 @@ pagespec had no noticable positive or negative effect on this test. > I see e4cd168ebedd95585290c97ff42234344bfed46c is still in your branch > though. I don't like using an arrayref, it could just take `($page, @depends)`. > and I don't see the need to keep it if it doesn't currently help. -> + +>> I'll drop it. --[[smcv]] + > Is there any reason to keep 7227c2debfeef94b35f7d81f42900aa01820caa3 > if it doesn't improve speed? > --[[Joey]] +>> I'll try benchmarking on a more complex wiki and see whether it has a +>> positive or negative effect. It does avoid being O(n**2) in number of +>> dependencies. --[[smcv]] + Memoizing the results of pagename brought the rebuild time down to 14.06s and the refresh time down to 7.96/7.92/7.92, a significant win. @@ -281,6 +290,9 @@ and the refresh time down to 7.96/7.92/7.92, a significant win. > called with a great many inputs.) Why did you chose to memoize it > explicitly rather than adding it to the memoize list at the top? +>> It does depend on global variables, so using Memoize seemed like asking for +>> trouble. I suppose what I did is equivalent to Memoize though... --[[smcv]] + Refactoring to use pagespec_match_list looks more risky from a code churn point of view; rebuild now takes 14.35s, but refresh is only 7.30/7.29/7.28, another significant win.