X-Git-Url: http://git.vanrenterghem.biz/git.ikiwiki.info.git/blobdiff_plain/7bfa77380ac1fda68d224343c46b310779ce9980..6cde77d5d46fd8b7e3fa711fa535c6a3de834b1c:/doc/todo/multi-thread_ikiwiki.mdwn

diff --git a/doc/todo/multi-thread_ikiwiki.mdwn b/doc/todo/multi-thread_ikiwiki.mdwn
index 3838103ff..358185a22 100644
--- a/doc/todo/multi-thread_ikiwiki.mdwn
+++ b/doc/todo/multi-thread_ikiwiki.mdwn
@@ -35,3 +35,55 @@ Disclaimer: I know nothing of the Perl approach to parallel processing.
 > > about meet the benefit of most of the threading/async work.
 > >
 > > --[[tychoish]]
+
+>>> It's at this point that doing profiling for a particular site would come
+>>> in, because it would depend on the site content and how exactly IkiWiki is
+>>> being used as to what the performance bottlenecks would be. For the
+>>> original poster, it would be image processing. For me, it tends to be
+>>> PageSpecs, because I have a lot of maps and reports.
+
+>>> But I sincerely don't think that Disk I/O is the main bottleneck, not when
+>>> the original poster mentions CPU usage, and also in my experience, I see
+>>> IkiWiki chewing up 100% CPU usage on one CPU, while the others remain idle. I
+>>> haven't noticed slowdowns due to waiting for disk I/O, whether that be a
+>>> system with HD or SSD storage.
+
+>>> I agree that large sites are probably not the most common use-case, but it
+>>> can be a chicken-and-egg situation with large sites and complete rebuilds,
+>>> since it can often be the case with a large site that rebuilding based on
+>>> dependencies takes *longer* than rebuilding the site from scratch, simply
+>>> because there are so many pages that are interdependent. It's not always
+>>> the number of pages itself, but how the site is being used. If IkiWiki is
+>>> used with the absolute minimum number of page-dependencies - that is, no
+>>> maps, no sitemaps, no trails, no tags, no backlinks, no albums - then one
+>>> can have a very large number of pages without having performance problems.
+>>> But when you have a change in PageA affecting PageB which affects PageC,
+>>> PageD, PageE and PageF, then performance can drop off horribly. And it's a
+>>> trade-off, because having features that interlink pages automatically is
+>>> really nifty and useful - but they have a price.
+
+>>> I'm not really sure what the best solution is. Me, I profile my IkiWiki builds and try to tweak performance for them... but there's only so much I can do.
+>>> --[[KathrynAndersen]]
+
+>>>> IMHO, the best way to get a multithreaded ikiwiki is to rewrite it
+>>>> in haskell, using as much pure code as possible. Many avenues
+>>>> then would open up to taking advantage of haskell's ability to
+>>>> parallelize pure code.
+>>>>
+>>>> With that said, we already have some nice invariants that could be
+>>>> used to parallelize page builds. In particular, we know that
+>>>> page A never needs state built up while building page B, for any
+>>>> pages A and B that don't have a dependency relationship -- and ikiwiki
+>>>> tracks such dependency relationships, although not currently in a form
+>>>> that makes it very easy (or fast..) to pick out such groups of
+>>>> unrelated pages.
+>>>>
+>>>> OTOH, there are problems.. building page A can result in changes to
+>>>> ikiwiki's state; building page B can result in other changes. All
+>>>> such changes would have to be made thread-safely. And would the
+>>>> resulting lock contention result in a program that ran any faster
+>>>> once parallelized?
+>>>>
+>>>> Which is why [[rewrite_ikiwiki_in_haskell]], while pretty insane, is
+>>>> something I keep thinking about. If only I had a spare year..
+>>>> --[[Joey]]
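
The thread's last comment says that pages with no dependency relationship never need each other's build state. One way to picture "picking out such groups of unrelated pages" is the rough Python sketch below. It is not ikiwiki code. The function name `independent_groups` and the `depends` mapping are hypothetical stand-ins for however ikiwiki actually stores its dependencies. The sketch only does the grouping and does not address the shared-state and lock-contention problem raised in the same comment.

```python
from collections import defaultdict


def independent_groups(pages, depends):
    """Split pages into groups that share no dependency edge.

    pages   -- iterable of page names (hypothetical input)
    depends -- dict: page -> pages it depends on (hypothetical, not
               ikiwiki's real data structure)
    """
    # Treat a dependency as undirected: if A depends on B, the two
    # pages cannot safely be assigned to different workers.
    neighbours = defaultdict(set)
    for page, deps in depends.items():
        for dep in deps:
            neighbours[page].add(dep)
            neighbours[dep].add(page)

    seen = set()
    groups = []
    for start in pages:
        if start in seen:
            continue
        # Depth-first walk collecting every page reachable from `start`.
        group, stack = [], [start]
        seen.add(start)
        while stack:
            page = stack.pop()
            group.append(page)
            for other in neighbours[page]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    pages = ["PageA", "PageB", "PageC", "Standalone"]
    depends = {"PageB": ["PageA"], "PageC": ["PageB"]}
    for group in independent_groups(pages, depends):
        print(group)
```

Each returned group could in principle be handed to a separate worker. On a heavily interlinked site (maps, tags, backlinks), most pages would likely collapse into one large group, which matches the concern raised earlier in the thread about interdependent pages.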