whether it would be worth making sort=name an alias for the current sort=title,
and changing the meaning of sort=title in 4.0, though).
+> What compatability concerns, exactly, are there that prevent making that
+> change now? --[[Joey]]
+
*[sort-hooks branch now withdrawn in favour of sort-package --s]*
I briefly tried to turn *all* the current sort types into hook functions, and
>> I wonder if IkiWiki would benefit from the concept of a "sortspec", like a [[ikiwiki/PageSpec]] but dedicated to sorting lists of pages rather than defining lists of pages? Rather than defining a sort-hook, define a SortSpec class, and enable people to add their own sort methods as functions defined inside that class, similarly to the way they can add their own pagespec definitions. --[[KathrynAndersen]]
->>> [[!template id=gitbranch branch=smcv/sort-package author="[[Simon_McVittie|smcv]]"]]
+>>> [[!template id=gitbranch branch=smcv/ready/sort-package author="[[Simon_McVittie|smcv]]"]]
>>> I'd be inclined to think that's overkill, but it wasn't very hard to
>>> implement, and in a way is more elegant. I set it up so sort mechanisms
>>> share the `IkiWiki::PageSpec` package, but with a `cmp_` prefix. Gitweb:
>>>>> introducing "sort spec" in the API. I took a cue from
>>>>> [[ikiwiki/pagespec/sorting]] being a subpage of
>>>>> [[ikiwiki/pagespec]], and decided that yes, sorting is
->>>>> a bit like a pagespec :-) --s
+>>>>> a bit like a pagespec :-) Which name would you prefer? --s
+
+>>>>>> `SortSpec` --[[Joey]]
+
+>>>>>>> [[Done]]. --s
->>>> I would be inclined to drop the `check_` stuff.
+>>>> I would be inclined to drop the `check_` stuff. --[[Joey]]
>>>>> It basically exists to support `title_natural`, to avoid
>>>>> firing up the whole import mechanism on every `cmp`
>>>>> [[field|plugins/contrib/field/discussion]], fail early
>>>>> (again, not so valuable).
>>>>>
+>>>>>> AFAIK, `use foo` has very low overhead when the module is already
+>>>>>> loaded. There could be some evalation overhead in `eval q{use foo}`,
+>>>>>> if so it would be worth addressing across the whole codebase.
+>>>>>> --[[Joey]]
+>>>>>>
+>>>>>>> check_cmp_foo now dropped. --s
+>>>>>
>>>>> The former function could be achieved at a small
>>>>> compatibility cost by putting `title_natural` in a new
->>>>> sortnatural plugin (that fails to load if you don't
+>>>>> `sortnatural` plugin (that fails to load if you don't
>>>>> have `title_natural`), if you'd prefer - that's what would
>>>>> have happened if `title_natural` was written after this
->>>>> code had been merged, I suspect. --s
+>>>>> code had been merged, I suspect. Would you prefer this? --s
+
+>>>>>> Yes! (Assuming it does not make sense to support
+>>>>>> natural order sort of other keys than the title, at least..)
+>>>>>> --[[Joey]]
+
+>>>>>>> Done. I added some NEWS.Debian for it, too. --s
>>>> Wouldn't it make sense to have `meta(title)` instead
->>>> of `meta_title`? --J
+>>>> of `meta_title`? --[[Joey]]
>>>>> Yes, you're right. I added parameters to support `field`,
>>>>> and didn't think about making `meta` use them too.
>>>>> same place as the meta-title, but occasionally not), while
>>>>> displaying meta-titles, does look quite odd. --s
+>>>>>> Agreed. --[[Joey]]
+
+>>>>>>> I've implemented meta(title). meta(author) also has the
+>>>>>>> `sortas` special case; meta(updated) and meta(date)
+>>>>>>> should also work how you'd expect them to (but they're
+>>>>>>> earliest-first, unlike age). --s
+
>>>> As I read the regexp in `cmpspec_translate`, the "command"
>>>> is required to have params. They should be optional,
>>>> to match the documentation and because most sort methods
->>>> do not need parameters. --J
+>>>> do not need parameters. --[[Joey]]
>>>>> No, `$2` is either `\w+\([^\)]*\)` or `[^\s]+` (with the
>>>>> latter causing an error later if it doesn't also match `\w+`).
>>>> "ascending age" both seem useful to be able to explicitly specify.
>>>> --[[Joey]]
->>>>> Perhaps. I do like the simplicity of [[KathyrnAndersen]]'s syntax
+>>>>> Perhaps. I do like the simplicity of [[KathrynAndersen]]'s syntax
>>>>> from [[plugins/contrib/report]] (which I copied verbatim, except for
->>>>> turning sort-by-`field` into a parameterized spec), and I can't really
->>>>> think of any sensible way to combine sort specs other than "sort by a,
->>>>> break ties by b, ...", possibly with some reversals thrown in.
+>>>>> turning sort-by-`field` into a parameterized spec).
+>>>>>
+>>>>> If we're getting into English-like (or at least SQL-like) queries,
+>>>>> it might make sense to change the signature of the hook function
+>>>>> so it's a function to return a key, e.g.
+>>>>> `sub key_age { return -%pagemtime{$_[0]) }`. Then we could sort like
+>>>>> this:
+>>>>>
+>>>>> field(artistsort) or field(artist) or constant(Various Artists) then meta(titlesort) or meta(title) or title
>>>>>
->>>>> If no other combinations do make sense, is your proposal that "then"
->>>>> is entirely redundant (easy, just make it a predefined sort spec that
->>>>> returns 0!), or that it's mandatory "punctuation" (add an explicit
->>>>> check, or make "then" expand to "||" and let Perl fail to compile
->>>>> the generated code if it's omitted)?
+>>>>> with "or" binding more closely than "then". Does this seem valuable?
+>>>>> I think the implementation would be somewhat more difficult. and
+>>>>> it's probably getting too complicated to be worthwhile, though?
+>>>>> (The keys that actually benefit from this could just
+>>>>> have smarter cmp functions, I think.)
>>>>>
->>>>> It is a little unfortunate that reversal has to move into the sort
->>>>> spec - I prefer `reverse=yes` - but that's necessary for multi-level
->>>>> sorting. I can see your point about ascending/descending being more
->>>>> obvious to look at, but they're also considerably more verbose.
+>>>>> If the hooks return keys rather than cmp results, then we could even
+>>>>> have "lowercase" as an adjective used like "ascending"... maybe.
+>>>>> However, there are two types of adjective here: "lowercase"
+>>>>> really applies to the keys, whereas "ascending" applies to the "cmp"
+>>>>> result. Again, I think this is getting too complex, and could just
+>>>>> be solved with smarter cmp functions.
+>>>>>
+>>>>>> I agree. (Also, I think returning keys may make it harder to write
+>>>>>> smarter cmp functions.) --[[Joey]]
>>>>>
>>>>> Unfortunately, `sort="ascending mtime"` actually sorts by *descending*
>>>>> timestamp (but`sort=age` is fine, because `age` could be defined as
>>>>> directions - it seems clearer to have `ascending` always be a no-op,
>>>>> and `descending` always negate.
>>>>>
+>>>>>> I think you've convinced me that ascending/descending impose too
+>>>>>> much semantics on it, so "-" is better. --[[Joey]]
+
+>>>>>>> I've kept the semantics from `report` as-is, then:
+>>>>>>> e.g. `sort="age -title"`. --s
+
>>>>> Perhaps we could borrow from `meta updated` and use `update_age`?
>>>>> `updateage` would perhaps be a more normal IkiWiki style - but that
>>>>> makes me think that updateage is a quantity analagous to tonnage or
>>>>> voltage, with more or less recently updated pages being said to have
>>>>> more or less updateage. I don't know whether that's good or bad :-)
>>>>>
->>>>> I'm sure there's a much better word, but I can't see it. --s
+>>>>> I'm sure there's a much better word, but I can't see it. Do you have
+>>>>> a better idea? --s
+
+[Regarding the `meta title=foo sort=bar` special case]
+
+> I feel it sould be clearer to call that "sortas", since "sort=" is used
+> to specify a sort method in other directives. --[[Joey]]
+>> Done. --[[smcv]]
+
+## speed
+
+I notice the implementation does not use the magic `$a` and `$b` globals.
+That nasty perl optimisation is still worthwhile:
+
+ perl -e 'use warnings; use strict; use Benchmark; sub a { $a <=> $b } sub b ($$) { $_[0] <=> $_[1] }; my @list=reverse(1..9999); timethese(10000, {a => sub {my @f=sort a @list}, b => sub {my @f=sort b @list}, c => => sub {my @f=sort { b($a,$b) } @list}})'
+ Benchmark: timing 10000 iterations of a, b, c...
+ a: 80 wallclock secs (76.74 usr + 0.05 sys = 76.79 CPU) @ 130.23/s (n=10000)
+ b: 112 wallclock secs (106.14 usr + 0.20 sys = 106.34 CPU) @ 94.04/s (n=10000)
+ c: 330 wallclock secs (320.25 usr + 0.17 sys = 320.42 CPU) @ 31.21/s (n=10000)
+
+Unfortunatly, I think that c is closest to the new implementation.
+--[[Joey]]
+
+> Unfortunately, `$a` isn't always `$main::a` - it's `$Package::a` where
+> `Package` is the call site of the sort call. This was a showstopper when
+> `sort` was a hook implemented in many packages, but now that it's a
+> `SortSpec`, I may be able to fix this by putting a `sort` wrapper in the
+> `SortSpec` namespace, so it's like this:
+>
+> sub sort ($@)
+> {
+> my $cmp = shift;
+> return sort $cmp @_;
+> }
+>
+> which would mean that the comparison used `$IkiWiki::SortSpec::a`.
+> --s
+
+>> I've now done this. On a wiki with many [[plugins/contrib/album]]s
+>> (a full rebuild takes half an hour!), I tested a refresh after
+>> `touch tags/*.mdwn` (my tag pages contain inlines of the form
+>> `tagged(foo)` sorted by date, so they exercise sorting).
+>> I also tried removing sorting from `pagespec_match_list`
+>> altogether, as an upper bound for how fast we can possibly make it.
+>>
+>> * `master` at branch point: 63.72user 0.29system
+>> * `master` at branch point: 63.91user 0.37system
+>> * my branch, with `@_`: 65.28user 0.29system
+>> * my branch, with `@_`: 65.21user 0.28system
+>> * my branch, with `$a`: 64.09user 0.28system
+>> * my branch, with `$a`: 63.83user 0.36system
+>> * not sorted at all: 58.99user 0.29system
+>> * not sorted at all: 58.92user 0.29system
+>>
+>> --s
+
+> I do notice that `pagespec_match_list` performs the sort before the
+> filter by pagespec. Is this a deliberate design choice, or
+> coincidence? I can see that when `limit` is used, this could be
+> used to only run the pagespec match function until `limit` pages
+> have been selected, but the cost is that every page in the wiki
+> is sorted. Or, it might be useful to do the filtering first, then
+> sort the sub-list thus produced, then finally apply the limit? --s
+
+>> Yes, it was deliberate, pagespec matching can be expensive enough that
+>> needing to sort a lot of pages seems likely to be less work. (I don't
+>> remember what benchmarking was done though.) --[[Joey]]
+
+>>> We discussed this on IRC and Joey pointed out that this also affects
+>>> dependency calculation, so I'm not going to get into this now... --s
+
+Joey pointed out on IRC that the `titlesort` feature duplicates all the
+meta titles. I did that in order to sort by the unescaped version, but
+I've now changed the branch to only store that if it makes a difference.
+--s
## Documentation from sort-package branch
-### meta_title sort order (conditionally added to [[ikiwiki/pagespec/sorting]])
+### advanced sort orders (conditionally added to [[ikiwiki/pagespec/sorting]])
-* `meta_title` - Order according to the `\[[!meta title="foo" sort="bar"]]`
+* `title_natural` - Orders by title, but numbers in the title are treated
+ as such, ("1 2 9 10 20" instead of "1 10 2 20 9")
+* `meta(title)` - Order according to the `\[[!meta title="foo" sortas="bar"]]`
or `\[[!meta title="foo"]]` [[ikiwiki/directive]], or the page name if no
- full title was set.
-
- > I feel it sould be clearer to call that "sortas", since "sort=" is used
- > to specify a sort method in other directives. --[[Joey]]
-
- >> Fair enough, that's easy to do. --[[smcv]]
+ full title was set. `meta(author)`, `meta(date)`, `meta(updated)`, etc.
+ also work.
### Multiple sort orders (added to [[ikiwiki/pagespec/sorting]])
sorting, with a string like `age -title` (which would sort by age, then by
title in reverse order if two pages have the same age).
-### meta title sort parameter (added to [[ikiwiki/directive/meta]])
+### meta sortas parameter (added to [[ikiwiki/directive/meta]])
+
+[in title]
An optional `sort` parameter will be used preferentially when
-[[ikiwiki/pagespec/sorting]] by `meta_title`:
+[[ikiwiki/pagespec/sorting]] by `meta(title)`:
\[[!meta title="The Beatles" sort="Beatles, The"]]
\[[!meta title="David Bowie" sort="Bowie, David"]]
+[in author]
+
+ An optional `sortas` parameter will be used preferentially when
+ [[ikiwiki/pagespec/sorting]] by `meta(author)`:
+
+ \[[!meta author="Joey Hess" sortas="Hess, Joey"]]
+
### Sorting plugins (added to [[plugins/write]])
Similarly, it's possible to write plugins that add new functions as
[[ikiwiki/pagespec/sorting]] methods. To achieve this, add a function to
-the IkiWiki::PageSpec package named `cmp_foo`, which will be used when sorting
+the IkiWiki::SortSpec package named `cmp_foo`, which will be used when sorting
by `foo` or `foo(...)` is requested.
-The function will be passed three or more parameters. The first two are
-page names, and the third is `undef` if invoked as `foo`, or the parameter
-`"bar"` if invoked as `foo(bar)`. It may also be passed additional, named
-parameters.
-
-It should return the same thing as Perl's `cmp` and `<=>` operators: negative
-if the first argument is less than the second, positive if the first argument
-is greater, or zero if they are considered equal. It may also raise an
-error using `error`, for instance if it needs a parameter but one isn't
-provided.
-
-You can also define a function called `check_cmp_foo` in the same package.
-If you do, it will be called while preparing to sort by `foo` or `foo(bar)`,
-with argument `undef` or `"bar"` respectively; it may raise an error using
-`error`, if sorting like that isn't going to work.
+The names of pages to be compared are in the global variables `$a` and `$b`
+in the IkiWiki::SortSpec package. The function should return the same thing
+as Perl's `cmp` and `<=>` operators: negative if `$a` is less than `$b`,
+positive if `$a` is greater, or zero if they are considered equal. It may
+also raise an error using `error`, for instance if it needs a parameter but
+one isn't provided.
+
+The function will also be passed one or more parameters. The first is
+`undef` if invoked as `foo`, or the parameter `"bar"` if invoked as `foo(bar)`;
+it may also be passed additional, named parameters.