[[!template id=gitbranch branch=smcv/ready/depends-exact author="[[smcv]]"]]

I'm still trying to optimize ikiwiki for a site using
[[plugins/contrib/album]], and checking which pages depend on which pages
is still taking too long. Here's another go at fixing that, using [[Will]]'s
suggestion from [[todo/should_optimise_pagespecs]]:

> A hash, by itself, is not optimal because
> the dependency list holds two things: page names and page specs. The hash would
> work well for the page names, but you'll still need to iterate through the page specs.
> I was thinking of keeping a list and a hash. You use the list for pagespecs
> and the hash for individual page names. To make this work you need to adjust the
> API so it knows which you're adding. -- [[Will]]
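
For illustration only, here is a minimal Python sketch of the "list and a hash" idea quoted above. It is not ikiwiki's Perl code: the `DependencyIndex` class, its method names, and the use of `fnmatchcase` as a stand-in for real pagespec matching are all assumptions made for the example.

```python
from fnmatch import fnmatchcase


class DependencyIndex:
    """Hypothetical store: exact page names in a set, pagespecs in a list."""

    def __init__(self):
        self.exact = {}   # page -> set of exact page names it depends on
        self.specs = {}   # page -> list of glob-like patterns (pagespec stand-ins)

    def add_depends_exact(self, page, name):
        # The caller states that this dependency is a plain page name.
        self.exact.setdefault(page, set()).add(name)

    def add_depends(self, page, pagespec):
        # Anything else is kept in a list and must be matched one by one.
        self.specs.setdefault(page, []).append(pagespec)

    def needs_rebuild(self, page, changed):
        """`changed` is a set of changed page names, built once per refresh."""
        # Exact names: one hash lookup per dependency.
        if any(name in changed for name in self.exact.get(page, ())):
            return True
        # Pagespecs: every spec is still tested against every changed page.
        return any(
            fnmatchcase(candidate, spec)
            for spec in self.specs.get(page, ())
            for candidate in changed
        )


index = DependencyIndex()
index.add_depends_exact("gallery", "photos/a")
index.add_depends("sitemap", "plugins/*")
print(index.needs_rebuild("gallery", {"photos/a"}))   # exact-name hit
print(index.needs_rebuild("sitemap", {"todo/other"}))  # no pagespec match
```

The exact-name check costs one set lookup per dependency, while the pagespec check still compares each spec with each changed page, which is why moving dependencies from the list into the hash pays off.
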

If you have P pages and refresh after changing C of them, where an average
page has E dependencies on exact page names and D other dependencies, this
branch should drop the complexity of checking dependencies from
O(P * (D+E) * C) to O(C + P*E + P*D*C). Pages that use inline or map have
a large value for E (e.g. one per inlined page) and a small value for D (e.g.

Test 1: a wiki with about 3500 pages and 3500 photos, and a change that
touches about 350 pages and 350 photos

Test 2: the docwiki (about 700 objects not excluded by docwiki.setup, mostly
pages), docwiki.setup modified to turn off verbose, and a change that touches
the 98 pages of plugins/*.mdwn

In both tests I rebuilt the wiki with the target ikiwiki version, then touched
the appropriate pages and refreshed.

Results of test 1: without this branch it took around 5:45 to rebuild and
around 5:45 again to refresh (so rebuilding 10% of the pages, then deciding
that most of the remaining 90% didn't need to change, took about as long as
rebuilding everything). With this branch it took 5:47 to rebuild and 1:16

Results of test 2: rebuilding took 14.11s without, 13.96s with; refreshing
three times took 7.29/7.40/7.37s without, 6.62/6.56/6.63s with.
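
To put the timings above in relative terms, here is a small Python sketch of the arithmetic. It assumes that the 1:16 figure in test 1 is the refresh time with this branch (the sentence reporting it is cut off), and the helper name `to_seconds` is made up for the example.

```python
from statistics import mean


def to_seconds(mmss):
    """Convert an 'M:SS' timing such as '5:45' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Test 2: three refresh runs each, in seconds, as reported above.
refresh_without = [7.29, 7.40, 7.37]
refresh_with = [6.62, 6.56, 6.63]
saving = 1 - mean(refresh_with) / mean(refresh_without)

# Test 1: refresh without the branch vs. (assumed) refresh with it.
test1_ratio = to_seconds("5:45") / to_seconds("1:16")

print(f"test 2: refresh takes {saving:.1%} less time with the branch")
print(f"test 1: refresh is about {test1_ratio:.1f}x faster with the branch")
```

On these numbers the docwiki refresh takes roughly a tenth less time, while the large album wiki (under the assumption above) refreshes about four and a half times faster.
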

(This benchmarking was actually done with my [[plugins/contrib/album]] branch,
since that's what the huge wiki needs; that branch doesn't alter core code
beyond the ready/depends-exact branch point, so the results should be

> We discussed this on IRC; I had some worries that things may have been
> switched to `add_depends_exact` that were not pure page names. My current
> feeling is it's all safe, but who knows. It's easy to miss something.
> Which makes me think this is not a good interface.

> Why not, instead, make `add_depends` smart. If it's passed something
> that is clearly a raw page name, it can add it to the exact depends hash.
> Else, add it to the pagespec hash. You can tell if it's a pure page name
> by matching on `$config{wiki_file_regexp}`.

> Also I think there may be little optimisation value left in
> 7227c2debfeef94b35f7d81f42900aa01820caa3, since the "regular" dependency
> lists will be much shorter.

> Sounds like inline pagenames has an already extant bug WRT
> pages moving, which this should not make worse. Would be good to verify.

> Re coding, it would be nice if `refresh()` could avoid duplicating
> the debug message, etc. in the two cases. --[[Joey]]
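
As a rough illustration of the "smart `add_depends`" suggestion above, here is a Python sketch rather than ikiwiki's Perl. The `WIKI_FILE_REGEXP` pattern is a hypothetical stand-in for `$config{wiki_file_regexp}`, and the two dictionaries and the returned labels are invented for the example.

```python
import re

# Hypothetical stand-in for $config{wiki_file_regexp}: the characters assumed
# to be allowed in a plain page name. The real value is site configuration.
WIKI_FILE_REGEXP = re.compile(r"^[-A-Za-z0-9_.:/+]+$")

exact_depends = {}     # page -> set of exact page names
pagespec_depends = {}  # page -> list of pagespecs


def add_depends(page, dep):
    """File `dep` as an exact name if it looks like one, else as a pagespec."""
    if WIKI_FILE_REGEXP.match(dep):
        exact_depends.setdefault(page, set()).add(dep)
        return "exact"
    # Globs, spaces, negation and parentheses all fail the match above,
    # so compound pagespecs end up here.
    pagespec_depends.setdefault(page, []).append(dep)
    return "pagespec"


print(add_depends("index", "plugins/map"))
print(add_depends("index", "plugins/* and !*/Discussion"))
```

With this shape, callers never have to choose between two functions, so a pagespec cannot be filed as an exact name by mistake.
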