X-Git-Url: http://git.vanrenterghem.biz/git.ikiwiki.info.git/blobdiff_plain/48c09d44637dd724d084b1d06e2277f11e80d489..b51703569d35790f31dccc3dc2921e8bcccd5b49:/doc/todo/allow_plugins_to_add_sorting_methods.mdwn diff --git a/doc/todo/allow_plugins_to_add_sorting_methods.mdwn b/doc/todo/allow_plugins_to_add_sorting_methods.mdwn index 99f256fbe..739a3d6b0 100644 --- a/doc/todo/allow_plugins_to_add_sorting_methods.mdwn +++ b/doc/todo/allow_plugins_to_add_sorting_methods.mdwn @@ -1,4 +1,3 @@ -[[!template id=gitbranch branch=smcv/sort-hooks author="[[Simon_McVittie|smcv]]"]] [[!tag patch]] The available [[ikiwiki/pagespec/sorting]] methods are currently hard-coded in @@ -11,48 +10,240 @@ title over the page name, but for compatibility, I'm not going to (I do wonder whether it would be worth making sort=name an alias for the current sort=title, and changing the meaning of sort=title in 4.0, though). -Gitweb: - +*[sort-hooks branch now withdrawn in favour of sort-package --s]* I briefly tried to turn *all* the current sort types into hook functions, and have some of them pre-registered, but decided that probably wasn't a good idea. That earlier version of the branch is also available for comparison: - +*[also withdrawn in favour of sort-package --s]* + +>> I wonder if IkiWiki would benefit from the concept of a "sortspec", like a [[ikiwiki/PageSpec]] but dedicated to sorting lists of pages rather than defining lists of pages? Rather than defining a sort-hook, define a SortSpec class, and enable people to add their own sort methods as functions defined inside that class, similarly to the way they can add their own pagespec definitions. --[[KathrynAndersen]] + +>>> [[!template id=gitbranch branch=smcv/sort-package author="[[Simon_McVittie|smcv]]"]] +>>> I'd be inclined to think that's overkill, but it wasn't very hard to +>>> implement, and in a way is more elegant. I set it up so sort mechanisms +>>> share the `IkiWiki::PageSpec` package, but with a `cmp_` prefix. Gitweb: +>>> + +>>>> I agree it seems more elegant, so I have focused on it. +>>>> +>>>> I don't know about reusing `IkiWiki::PageSpec` for this. +>>>> --[[Joey]] + +>>>>> Fair enough, `IkiWiki::SortSpec::cmp_foo` would be just +>>>>> as easy, or `IkiWiki::Sorting::cmp_foo` if you don't like +>>>>> introducing "sort spec" in the API. I took a cue from +>>>>> [[ikiwiki/pagespec/sorting]] being a subpage of +>>>>> [[ikiwiki/pagespec]], and decided that yes, sorting is +>>>>> a bit like a pagespec :-) Which name would you prefer? --s + +>>>>>> `SortSpec` --[[Joey]] + +>>>>>>> Done. --s + +>>>> I would be inclined to drop the `check_` stuff. --[[Joey]] + +>>>>> It basically exists to support `title_natural`, to avoid +>>>>> firing up the whole import mechanism on every `cmp` +>>>>> (although I suppose that could just be a call to a +>>>>> memoized helper function). It also lets sort specs that +>>>>> *must* have a parameter, like +>>>>> [[field|plugins/contrib/field/discussion]], fail early +>>>>> (again, not so valuable). +>>>>> +>>>>>> AFAIK, `use foo` has very low overhead when the module is already +>>>>>> loaded. There could be some evalation overhead in `eval q{use foo}`, +>>>>>> if so it would be worth addressing across the whole codebase. +>>>>>> --[[Joey]] +>>>>>> +>>>>>>> check_cmp_foo now dropped. --s +>>>>> +>>>>> The former function could be achieved at a small +>>>>> compatibility cost by putting `title_natural` in a new +>>>>> `sortnatural` plugin (that fails to load if you don't +>>>>> have `title_natural`), if you'd prefer - that's what would +>>>>> have happened if `title_natural` was written after this +>>>>> code had been merged, I suspect. Would you prefer this? --s + +>>>>>> Yes! (Assuming it does not make sense to support +>>>>>> natural order sort of other keys than the title, at least..) +>>>>>> --[[Joey]] + +>>>>>>> Done. I added some NEWS.Debian for it, too. --s + +>>>> Wouldn't it make sense to have `meta(title)` instead +>>>> of `meta_title`? --[[Joey]] + +>>>>> Yes, you're right. I added parameters to support `field`, +>>>>> and didn't think about making `meta` use them too. +>>>>> However, `title` does need a special case to make it +>>>>> default to the basename instead of the empty string. +>>>>> +>>>>> Another special case for `title` is to use `titlesort` +>>>>> first (the name `titlesort` is derived from Ogg/FLAC +>>>>> tags, which can have `titlesort` and `artistsort`). +>>>>> I could easily extend that to other metas, though; +>>>>> in fact, for e.g. book lists it would be nice for +>>>>> `field(bookauthor)` to behave similarly, so you can +>>>>> display "Douglas Adams" but sort by "Adams, Douglas". +>>>>> +>>>>> `meta_title` is also meant to be a prototype of how +>>>>> `sort=title` could behave in 4.0 or something - sorting +>>>>> by page name (which usually sorts in approximately the +>>>>> same place as the meta-title, but occasionally not), while +>>>>> displaying meta-titles, does look quite odd. --s + +>>>>>> Agreed. --[[Joey]] + +>>>>>>> I've implemented meta(title). meta(author) also has the +>>>>>>> `sortas` special case; meta(updated) and meta(date) +>>>>>>> should also work how you'd expect them to (but they're +>>>>>>> earliest-first, unlike age). --s + +>>>> As I read the regexp in `cmpspec_translate`, the "command" +>>>> is required to have params. They should be optional, +>>>> to match the documentation and because most sort methods +>>>> do not need parameters. --[[Joey]] + +>>>>> No, `$2` is either `\w+\([^\)]*\)` or `[^\s]+` (with the +>>>>> latter causing an error later if it doesn't also match `\w+`). +>>>>> This branch doesn't add any parameterized sort methods, +>>>>> in fact, although I did provide one on +>>>>> [[field's_discussion_page|plugins/contrib/report/discussion]]. --s + +>>>> I wonder if it would make sense to add some combining keywords, so +>>>> a sortspec reads like `sort="age then ascending title"` +>>>> In a way, this reduces the amount of syntax that needs to be learned. +>>>> I like the "then" (and it could allow other operations than +>>>> simple combination, if any others make sense). Not so sure about the +>>>> "ascending", which could be "reverse" instead, but "descending age" and +>>>> "ascending age" both seem useful to be able to explicitly specify. +>>>> --[[Joey]] + +>>>>> Perhaps. I do like the simplicity of [[KathrynAndersen]]'s syntax +>>>>> from [[plugins/contrib/report]] (which I copied verbatim, except for +>>>>> turning sort-by-`field` into a parameterized spec). +>>>>> +>>>>> If we're getting into English-like (or at least SQL-like) queries, +>>>>> it might make sense to change the signature of the hook function +>>>>> so it's a function to return a key, e.g. +>>>>> `sub key_age { return -%pagemtime{$_[0]) }`. Then we could sort like +>>>>> this: +>>>>> +>>>>> field(artistsort) or field(artist) or constant(Various Artists) then meta(titlesort) or meta(title) or title +>>>>> +>>>>> with "or" binding more closely than "then". Does this seem valuable? +>>>>> I think the implementation would be somewhat more difficult. and +>>>>> it's probably getting too complicated to be worthwhile, though? +>>>>> (The keys that actually benefit from this could just +>>>>> have smarter cmp functions, I think.) +>>>>> +>>>>> If the hooks return keys rather than cmp results, then we could even +>>>>> have "lowercase" as an adjective used like "ascending"... maybe. +>>>>> However, there are two types of adjective here: "lowercase" +>>>>> really applies to the keys, whereas "ascending" applies to the "cmp" +>>>>> result. Again, I think this is getting too complex, and could just +>>>>> be solved with smarter cmp functions. +>>>>> +>>>>>> I agree. (Also, I think returning keys may make it harder to write +>>>>>> smarter cmp functions.) --[[Joey]] +>>>>> +>>>>> Unfortunately, `sort="ascending mtime"` actually sorts by *descending* +>>>>> timestamp (but`sort=age` is fine, because `age` could be defined as +>>>>> now minus `ctime`). `sort=freshness` isn't right either, because +>>>>> "sort by freshness" seems as though it ought to mean freshest first, +>>>>> but "sort by ascending freshness" means put the least fresh first. If +>>>>> we have ascending and descending keywords which are optional, I don't +>>>>> think we really want different sort types to have different default +>>>>> directions - it seems clearer to have `ascending` always be a no-op, +>>>>> and `descending` always negate. +>>>>> +>>>>>> I think you've convinced me that ascending/descending impose too +>>>>>> much semantics on it, so "-" is better. --[[Joey]] + +>>>>>>> I've kept the semantics from `report` as-is, then: +>>>>>>> e.g. `sort="age -title"`. --s + +>>>>> +>>>>> Perhaps we could borrow from `meta updated` and use `update_age`? +>>>>> `updateage` would perhaps be a more normal IkiWiki style - but that +>>>>> makes me think that updateage is a quantity analagous to tonnage or +>>>>> voltage, with more or less recently updated pages being said to have +>>>>> more or less updateage. I don't know whether that's good or bad :-) +>>>>> +>>>>> I'm sure there's a much better word, but I can't see it. Do you have +>>>>> a better idea? --s + +[Regarding the `meta title=foo sort=bar` special case] + +> I feel it sould be clearer to call that "sortas", since "sort=" is used +> to specify a sort method in other directives. --[[Joey]] +>> Done. --[[smcv]] + +## speed + +I notice the implementation does not use the magic `$a` and `$b` globals. +That nasty perl optimisation is still worthwhile: + + perl -e 'use warnings; use strict; use Benchmark; sub a { $a <=> $b } sub b ($$) { $_[0] <=> $_[1] }; my @list=reverse(1..9999); timethese(10000, {a => sub {my @f=sort a @list}, b => sub {my @f=sort b @list}, c => => sub {my @f=sort { b($a,$b) } @list}})' + Benchmark: timing 10000 iterations of a, b, c... + a: 80 wallclock secs (76.74 usr + 0.05 sys = 76.79 CPU) @ 130.23/s (n=10000) + b: 112 wallclock secs (106.14 usr + 0.20 sys = 106.34 CPU) @ 94.04/s (n=10000) + c: 330 wallclock secs (320.25 usr + 0.17 sys = 320.42 CPU) @ 31.21/s (n=10000) + +Unfortunatly, I think that c is closest to the new implementation. +--[[Joey]] + +## Documentation from sort-package branch + +### advanced sort orders (conditionally added to [[ikiwiki/pagespec/sorting]]) + +* `title_natural` - Orders by title, but numbers in the title are treated + as such, ("1 2 9 10 20" instead of "1 10 2 20 9") +* `meta(title)` - Order according to the `\[[!meta title="foo" sortas="bar"]]` + or `\[[!meta title="foo"]]` [[ikiwiki/directive]], or the page name if no + full title was set. `meta(author)`, `meta(date)`, `meta(updated)`, etc. + also work. -(The older version is untested, and probably doesn't really work as-is - I -misunderstood the details of how the built-in function `sort` works when using -`$a` and `$b`. The newer version has been tested, and has a regression test for -its core functionality.) +### Multiple sort orders (added to [[ikiwiki/pagespec/sorting]]) -## Documentation extracted from the branch +In addition, you can combine several sort orders and/or reverse the order of +sorting, with a string like `age -title` (which would sort by age, then by +title in reverse order if two pages have the same age). -### sort hook (added to [[plugins/write]]) +### meta sortas parameter (added to [[ikiwiki/directive/meta]]) - hook(type => "sort", id => "foo", call => \&sort_by_foo); +[in title] -This hook adds an additional [[ikiwiki/pagespec/sorting]] order or overrides -an existing one. The callback is given two page names as arguments, and -returns negative, zero or positive if the first page should come before, -close to (i.e. undefined order), or after the second page. +An optional `sort` parameter will be used preferentially when +[[ikiwiki/pagespec/sorting]] by `meta(title)`: -For instance, the built-in `title` sort order could be reimplemented as + \[[!meta title="The Beatles" sort="Beatles, The"]] - sub sort_by_title { - pagetitle(basename($_[0])) cmp pagetitle(basename($_[1])); - } + \[[!meta title="David Bowie" sort="Bowie, David"]] -### meta_title sort order (conditionally added to [[ikiwiki/pagespec/sorting]]) +[in author] -* `meta_title` - Order according to the `\[[!meta title="foo" sort="bar"]]` - or `\[[!meta title="foo"]]` [[ikiwiki/directive]], or the page name if no - full title was set. + An optional `sortas` parameter will be used preferentially when + [[ikiwiki/pagespec/sorting]] by `meta(author)`: -### meta title sort parameter (added to [[ikiwiki/directive/meta]]) + \[[!meta author="Joey Hess" sortas="Hess, Joey"]] -An optional `sort` parameter will be used preferentially when -[[ikiwiki/pagespec/sorting]] by `meta_title`: +### Sorting plugins (added to [[plugins/write]]) - \[[!meta title="The Beatles" sort="Beatles, The"]] +Similarly, it's possible to write plugins that add new functions as +[[ikiwiki/pagespec/sorting]] methods. To achieve this, add a function to +the IkiWiki::SortSpec package named `cmp_foo`, which will be used when sorting +by `foo` or `foo(...)` is requested. - \[[!meta title="David Bowie" sort="Bowie, David"]] +The function will be passed three or more parameters. The first two are +page names, and the third is `undef` if invoked as `foo`, or the parameter +`"bar"` if invoked as `foo(bar)`. It may also be passed additional, named +parameters. + +It should return the same thing as Perl's `cmp` and `<=>` operators: negative +if the first argument is less than the second, positive if the first argument +is greater, or zero if they are considered equal. It may also raise an +error using `error`, for instance if it needs a parameter but one isn't +provided.