X-Git-Url: http://git.vanrenterghem.biz/git.ikiwiki.info.git/blobdiff_plain/53453ebdb2c6624abcbb56a2f9605fffcd326c77..3fe434189b99daf3d8cc075186c4ed4ebdd43b8f:/doc/todo/structured_page_data.mdwn?ds=sidebyside diff --git a/doc/todo/structured_page_data.mdwn b/doc/todo/structured_page_data.mdwn index 59756152f..9f21fab7f 100644 --- a/doc/todo/structured_page_data.mdwn +++ b/doc/todo/structured_page_data.mdwn @@ -1,6 +1,8 @@ This is an idea from [[JoshTriplett]]. --[[Joey]] -Some uses of ikiwiki, such as for a BTS, move a bit away from the wiki end +* See further discussion at [[forum/an_alternative_approach_to_structured_data]]. + +Some uses of ikiwiki, such as for a bug-tracking system (BTS), move a bit away from the wiki end of the spectrum, and toward storing structured data about a page or instead of a page. @@ -45,13 +47,587 @@ This seems conceptually straightforward, if possibly quite internally complex to handle the more complicated types and validation. One implementation wrinkle is how to build the html form. The editpage.tmpl -currently overrides the standard [[cpan CGI::FormBuilder]] generated form, +currently overrides the standard [[!cpan CGI::FormBuilder]] generated form, which was done to make the edit page be laid out in a nice way. This, however, means that new fields cannot be easily added to it using -[[cpan CGI::FormBuilder]]. The attachment plugin uses the hack of bouilding +[[!cpan CGI::FormBuilder]]. The attachment plugin uses the hack of bouilding up html by hand and dumping it into the form via a template variable. It would be nice if the type implementation code could just use FormBuilder, since its automatic form generation, and nice field validation model is a perfect match for structured data. But this problem with editpage.tmpl would have to be sorted out to allow that. + +Additional tie-ins: + +* Pagespecs that can select pages with a field with a given value, etc. + This should use a pagespec function like field(fieldname, value). The + semantics of this will depend on the type of the field; text fields will + match value against the text, and link fields will check for a link + matching the pagespec value. +* The search plugin could allow searching for specific fields with specific + content. (xapian term search is a good fit). + +See also: + +[[tracking_bugs_with_dependencies]] + +> I was also thinking about this for bug tracking. I'm not sure what +> sort of structured data is wanted in a page, so I decided to brainstorm +> use cases: +> +> * You just want the page to be pretty. +> * You want to access the data from another page. This would be almost like +> like a database lookup, or the OpenOffice Calc [VLookup](http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_VLOOKUP_function) function. +> * You want to make a pagespec depend upon the data. This could be used +> for dependancy tracking - you could match against pages listed as dependencies, +> rather than all pages linked from a given page. +> +>The first use case is handled by having a template in the page creation. You could + + + + +>have some type of form to edit the data, but that's just sugar on top of the template. +>If you were going to have a web form to edit the data, I can imagine a few ways to do it: +> +> * Have a special page type which gets compiled into the form. The page type would +> need to define the form as well as hold the stored data. +> * Have special directives that allow you to insert form elements into a normal page. +> +>I'm happy with template based page creation as a first pass... +> +>The second use case could be handled by a regular expression directive. eg: +> +> \[[regex spec="myBug" regex="Depends: ([^\s]+)"]] +> +> The directive would be replaced with the match from the regex on the 'myBug' page... or something. +> +>The third use case requires a pagespec function. One that matched a regex in the page might work. +>Otherwise, another option would be to annotate links with a type, and then check the type of links in +>a pagespec. e.g. you could have `depends` links and normal links. +> +>Anyway, I just wanted to list the thoughts. In none of these use cases is straight yaml or json the +>obvious answer. -- [[Will]] + +>> Okie. I've had a play with this. A 'form' plugin is included inline below, but it is only a rough first pass to +>> get a feel for the design space. +>> +>> The current design defines a new type of page - a 'form'. The type of page holds YAML data +>> defining a FormBuilder form. For example, if we add a file to the wiki source `test.form`: + + --- + fields: + age: + comment: This is a test + validate: INT + value: 15 + +>> The YAML content is a series of nested hashes. The outer hash is currently checked for two keys: +>> 'template', which specifies a parameter to pass to the FromBuilder as the template for the +>> form, and 'fields', which specifies the data for the fields on the form. +>> each 'field' is itself a hash. The keys and values are arguments to the formbuilder form method. +>> The most important one is 'value', which specifies the value of that field. +>> +>> Using this, the plugin below can output a form when asked to generate HTML. The Formbuilder +>> arguments are sanitized (need a thorough security audit here - I'm sure I've missed a bunch of +>> holes). The form is generated with default values as supplied in the YAML data. It also has an +>> 'Update Form' button at the bottom. +>> +>> The 'Update Form' button in the generated HTML submits changed values back to IkiWiki. The +>> plugin captures these new values, updates the YAML and writes it out again. The form is +>> validated when edited using this method. This method can only edit the values in the form. +>> You cannot add new fields this way. +>> +>> It is still possible to edit the YAML directly using the 'edit' button. This allows adding new fields +>> to the form, or adding other formbuilder data to change how the form is displayed. +>> +>> One final part of the plugin is a new pagespec function. `form_eq()` is a pagespec function that +>> takes two arguments (separated by a ','). The first argument is a field name, the second argument +>> a value for that field. The function matches forms (and not other page types) where the named +>> field exists and holds the value given in the second argument. For example: + + \[[!inline pages="form_eq(age,15)" archive="yes"]] + +>> will include a link to the page generated above. + +>>> Okie, I've just made another plugin to try and do things in a different way. +>>> This approach adds a 'data' directive. There are two arguments, `key` and `value`. +>>> The directive is replaced by the value. There is also a match function, which is similar +>>> to the one above. It also takes two arguments, a key and a value. It returns true if the +>>> page has that key/value pair in a data directive. e.g.: + + \[[!data key="age" value="15"]] + +>>> then, in another page: + + \[[!inline pages="data_eq(age,15)" archive="yes"]] + +>>> I expect that we could have more match functions for each type of structured data, +>>> I just wanted to implement a rough prototype to get a feel for how it behaves. -- [[Will]] + +>> Anyway, here are the plugins. As noted above these are only preliminary, exploratory, attempts. -- [[Will]] + +>>>> I've just updated the second of the two patches below. The two patches are not mutually +>>>> exclusive, but I'm leaning towards the second as more useful (for the things I'm doing). -- [[Will]] + +I think it's awesome that you're writing this code to explore the problem +space, [[Will]] -- and these plugins are good stabs at at least part of it. +Let me respond to a few of your comments.. --[[Joey]] + +On use cases, one use case is a user posting a bug report with structured +data in it. A template is one way, but then the user has to deal with the +format used to store the structured data. This is where a edit-time form +becomes essential. + +> This was the idea with the 'form' plugin. With the 'data' plugin I was exploring +> a different approach: try to keep the markup simple enough that the user can edit +> the markup directly, and still have that be ok. I admit it is a stretch, but I thought +> it worth exploring. + +Another use case is, after many such bugs have been filed, +wanting to add a new field to each bug report. To avoid needing to edit +every bug report it would be good if the fields in a bug report were +defined somewhere else, so that just that one place can be edited to add +the new field, and it will show up in each bug report (and in each bug +report's edit page, as a new form field). + +> If I was going to do that, I'd use a perl script on a checked out +> workspace. I think you're describing a rare operation and +> so I'd be happy not having a web interface for it. Having said that, +> if you just wanted to change the form for *new* pages, then you +> can just edit the template used to create new pages. + +Re the form plugin, I'm uncomfortable with tying things into +[[!cpan CGI::FormBuilder]] quite so tightly as you have. + +> Yeah :). But I wanted to explore the space and that was the +> easiest way to start. + +CGI::FormBuilder +could easily change in a way that broke whole wikis full of pages. Also, +needing to sanitize FormBuilder fields with security implications is asking +for trouble, since new FormBuilder features could add new fields, or +add new features to existing fields (FormBuilder is very DWIM) that open +new security holes. + +> There is a list of allowed fields. I only interpret those. + +I think that having a type system, that allows defining specific types, +like "email address", by writing code (that in turn can use FormBuilder), +is a better approach, since it should avoid becoming a security problem. + +> That would be possible. I think an extension to the 'data' plugin might +> work here. + +One specific security hole, BTW, is that if you allow the `validate` field, +FormBuilder will happily treat it as a regexp, and we don't want to expose +arbitrary perl regexps, since they can at least DOS a system, and can +probably be used to run arbitrary perl code. + +> I validate the validate field :). It only allows validate fields that match +> `/^[\w\s]+$/`. This means you can really only use the pre-defined +> validation types in FormBuilder. + +The data plugin only deals with a fairly small corner of the problem space, +but I think does a nice job at what it does. And could probably be useful +in a large number of other cases. + +> I think the data plugin is more likely to be useful than the form plugin. +> I was thinking of extending the data directive by allowing an 'id' parameter. +> When you have an id parameter, then you can display a small form for that +> data element. The submission handler would look through the page source +> for the data directive with the right id parameter and edit it. This would +> make the data directive more like the current 'form' plugin. + +> That is making things significantly more complex for less significant gain though. --[[Will]] + +> Oh, one quick other note. The data plugin below was designed to handle multiple +> data elements in a single directive. e.g. + + \[[!data key="Depends on" link="bugs/bugA" link="bugs/bugB" value=6]] + +> would match `data_eq(Depends on,6)`, `data_link(Depends on,bugs/bugA)`, `data_link(Depends on,bugs/bugB)` +> or, if you applied the patch in [[todo/tracking_bugs_with_dependencies]] then you can use 'defined pagespecs' +> such as `data_link(Depends on,~openBugs)`. The ability to label links like this allows separation of +> dependencies between bugs from arbitrary links. +>> This issue (the need for distinguished kinds of links) has also been brought up in other discussions: [[tracking_bugs_with_dependencies#another_kind_of_links]] (deps vs. links) and [[tag_pagespec_function]] (tags vs. links). --Ivan Z. + +>>> And multiple link types are now supported; plugins can set the link +>>> type when registering a link, and pagespec functions can match on them. --[[Joey]] + +---- + + #!/usr/bin/perl + # Interpret YAML data to make a web form + package IkiWiki::Plugin::form; + + use warnings; + use strict; + use CGI::FormBuilder; + use IkiWiki 2.00; + + sub import { + hook(type => "getsetup", id => "form", call => \&getsetup); + hook(type => "htmlize", id => "form", call => \&htmlize); + hook(type => "sessioncgi", id => "form", call => \&cgi_submit); + } + + sub getsetup () { + return + plugin => { + safe => 1, + rebuild => 1, # format plugin + }, + } + + sub makeFormFromYAML ($$$) { + my $page = shift; + my $YAMLString = shift; + my $q = shift; + + eval q{use YAML}; + error($@) if $@; + eval q{use CGI::FormBuilder}; + error($@) if $@; + + my ($dataHashRef) = YAML::Load($YAMLString); + + my @fields = keys %{ $dataHashRef->{fields} }; + + unshift(@fields, 'do'); + unshift(@fields, 'page'); + unshift(@fields, 'rcsinfo'); + + # print STDERR "Fields: @fields\n"; + + my $submittedPage; + + $submittedPage = $q->param('page') if defined $q; + + if (defined $q && defined $submittedPage && ! ($submittedPage eq $page)) { + error("Submitted page doensn't match current page: $page, $submittedPage"); + } + + error("Page not backed by file") unless defined $pagesources{$page}; + my $file = $pagesources{$page}; + + my $template; + + if (defined $dataHashRef->{template}) { + $template = $dataHashRef->{template}; + } else { + $template = "form.tmpl"; + } + + my $form = CGI::FormBuilder->new( + fields => \@fields, + charset => "utf-8", + method => 'POST', + required => [qw{page}], + params => $q, + action => $config{cgiurl}, + template => scalar IkiWiki::template_params($template), + wikiname => $config{wikiname}, + header => 0, + javascript => 0, + keepextras => 0, + title => $page, + ); + + $form->field(name => 'do', value => 'Update Form', required => 1, force => 1, type => 'hidden'); + $form->field(name => 'page', value => $page, required => 1, force => 1, type => 'hidden'); + $form->field(name => 'rcsinfo', value => IkiWiki::rcs_prepedit($file), required => 1, force => 0, type => 'hidden'); + + my %validkey; + foreach my $x (qw{label type multiple value fieldset growable message other required validate cleanopts columns comment disabled linebreaks class}) { + $validkey{$x} = 1; + } + + while ( my ($name, $data) = each(%{ $dataHashRef->{fields} }) ) { + next if $name eq 'page'; + next if $name eq 'rcsinfo'; + + while ( my ($key, $value) = each(%{ $data }) ) { + next unless $validkey{$key}; + next if $key eq 'validate' && !($value =~ /^[\w\s]+$/); + + # print STDERR "Adding to field $name: $key => $value\n"; + $form->field(name => $name, $key => $value); + } + } + + # IkiWiki::decode_form_utf8($form); + + return $form; + } + + sub htmlize (@) { + my %params=@_; + my $content = $params{content}; + my $page = $params{page}; + + my $form = makeFormFromYAML($page, $content, undef); + + return $form->render(submit => 'Update Form'); + } + + sub cgi_submit ($$) { + my $q=shift; + my $session=shift; + + my $do=$q->param('do'); + return unless $do eq 'Update Form'; + IkiWiki::decode_cgi_utf8($q); + + eval q{use YAML}; + error($@) if $@; + eval q{use CGI::FormBuilder}; + error($@) if $@; + + my $page = $q->param('page'); + + return unless exists $pagesources{$page}; + + return unless $pagesources{$page} =~ m/\.form$/ ; + + return unless IkiWiki::check_canedit($page, $q, $session); + + my $file = $pagesources{$page}; + my $YAMLString = readfile(IkiWiki::srcfile($file)); + my $form = makeFormFromYAML($page, $YAMLString, $q); + + my ($dataHashRef) = YAML::Load($YAMLString); + + if ($form->submitted eq 'Update Form' && $form->validate) { + + #first update our data structure + + while ( my ($name, $data) = each(%{ $dataHashRef->{fields} }) ) { + next if $name eq 'page'; + next if $name eq 'rcs-data'; + + if (defined $q->param($name)) { + $data->{value} = $q->param($name); + } + } + + # now write / commit the data + + writefile($file, $config{srcdir}, YAML::Dump($dataHashRef)); + + my $message = "Web form submission"; + + IkiWiki::disable_commit_hook(); + my $conflict=IkiWiki::rcs_commit($file, $message, + $form->field("rcsinfo"), + $session->param("name"), $ENV{REMOTE_ADDR}); + IkiWiki::enable_commit_hook(); + IkiWiki::rcs_update(); + + require IkiWiki::Render; + IkiWiki::refresh(); + + IkiWiki::redirect($q, "$config{url}/".htmlpage($page)."?updated"); + + } else { + error("Invalid data!"); + } + + exit; + } + + package IkiWiki::PageSpec; + + sub match_form_eq ($$;@) { + my $page=shift; + my $argSet=shift; + my @args=split(/,/, $argSet); + my $field=shift @args; + my $value=shift @args; + + my $file = $IkiWiki::pagesources{$page}; + + if ($file !~ m/\.form$/) { + return IkiWiki::FailReason->new("page is not a form"); + } + + my $YAMLString = IkiWiki::readfile(IkiWiki::srcfile($file)); + + eval q{use YAML}; + error($@) if $@; + + my ($dataHashRef) = YAML::Load($YAMLString); + + if (! defined $dataHashRef->{fields}->{$field}) { + return IkiWiki::FailReason->new("field '$field' not defined in page"); + } + + my $formVal = $dataHashRef->{fields}->{$field}->{value}; + + if ($formVal eq $value) { + return IkiWiki::SuccessReason->new("field value matches"); + } else { + return IkiWiki::FailReason->new("field value does not match"); + } + } + + 1 + +---- + + #!/usr/bin/perl + # Allow data embedded in a page to be checked for + package IkiWiki::Plugin::data; + + use warnings; + use strict; + use IkiWiki 2.00; + + my $inTable = 0; + + sub import { + hook(type => "getsetup", id => "data", call => \&getsetup); + hook(type => "needsbuild", id => "data", call => \&needsbuild); + hook(type => "preprocess", id => "data", call => \&preprocess, scan => 1); + hook(type => "preprocess", id => "datatable", call => \&preprocess_table, scan => 1); # does this need scan? + } + + sub getsetup () { + return + plugin => { + safe => 1, + rebuild => 1, # format plugin + }, + } + + sub needsbuild (@) { + my $needsbuild=shift; + foreach my $page (keys %pagestate) { + if (exists $pagestate{$page}{data}) { + if (exists $pagesources{$page} && + grep { $_ eq $pagesources{$page} } @$needsbuild) { + # remove state, it will be re-added + # if the preprocessor directive is still + # there during the rebuild + delete $pagestate{$page}{data}; + } + } + } + } + + sub preprocess (@) { + my @argslist = @_; + my %params=@argslist; + + my $html = ''; + my $class = defined $params{class} + ? 'class="'.$params{class}.'"' + : ''; + + if ($inTable) { + $html = "