X-Git-Url: http://git.vanrenterghem.biz/git.ikiwiki.info.git/blobdiff_plain/75bb0df5280deb28107c5e4a4b2f58f59c27ef7c..cf1290eb464f1256aa5c12d973fff774e4f83e5e:/doc/todo/commandline_comment_moderation.mdwn diff --git a/doc/todo/commandline_comment_moderation.mdwn b/doc/todo/commandline_comment_moderation.mdwn index 8cce512ee..a72fb31b2 100644 --- a/doc/todo/commandline_comment_moderation.mdwn +++ b/doc/todo/commandline_comment_moderation.mdwn @@ -1,5 +1,105 @@ -So I have enabled the [[moderatedcomments]] plugin on my wiki. and good thing that! around 1000 spammy comments showed up in the last 3 months! Awful! +So I have enabled the [[plugins/moderatedcomments]] plugin on my wiki. and good thing that! around 1000 spammy comments showed up in the last 3 months! Awful! It's pretty hard to figure out the ham and the spam in there. One thing I was hoping was to use the power of the commandline to filter through all that stuff. Now, it seems there's only a "ikiwiki-comment" tool now, and nothing to examine the moderated comments. -It seems to me it would be great to have some tool to filter through that... --[[anarcat]] +It seems to me it would be great to have some tool to filter through that... + + +So it turns out it was over 3000 comments. The vast majority of those (every one but 42 comments) were from the IP `46.161.41.34` which i recommend null-routing everywhere. I used the following shell magic to figure that out: + +
+#!/bin/sh
+
+set -e
+
+cd .ikiwiki/transient || {
+    echo could not find comments, make sure you are in a ikiwiki source directory.
+    exit 1
+    }
+# count the number of comments
+echo found $(find . -name '*._comment_pending' | wc -l) pending comments
+# number of comments per IP
+echo IP distribution:
+find . -name '*._comment_pending' | xargs grep -h ip= | sort | uniq -c | sort -n
+# generate a banlist for insertion in `banusers`, assuming all the
+# pending comments are spam
+echo banlist would look like:
+find . -name '*._comment_pending' | xargs grep -h ip= | sort -u| sed 's/ ip="//;s/"//;s/^/- ip(/;s/$/)/'
+
+echo to remove comments from a specific IP, use one of those:
+find . -name '*._comment_pending' | xargs grep -h ip= | sort -u \
+    | sed 's/ ip="//;s/"//;' \
+    | while read ip; do
+          echo "find . -name '*._comment_pending' | xargs grep -l 'ip=\"$ip\"'| xargs rm"
+      done
+echo to flush all pending comments, use:
+echo "find . -name '*._comment_pending' -delete"
+
+ +The remaining 42 comments I reviewed throught the web interface, then flushed using the above command. My final addition to the banlist is: + +
+- ip(159.224.160.225)
+- ip(176.10.104.227)
+- ip(176.10.104.234)
+- ip(188.143.233.211)
+- ip(193.201.227.41)
+- ip(195.154.181.152)
+- ip(213.238.175.29)
+- ip(31.184.238.11)
+- ip(37.57.231.112)
+- ip(37.57.231.204)
+- ip(46.161.41.34)
+- ip(46.161.41.199)
+- ip(95.130.13.111)
+- ip(95.181.178.142)
+
+ + --[[anarcat]] + +Update: i made a script, above. And the banlist is much larger now so the above list is pretty much out of date now... --[[anarcat]] + +Another update, 2020: I rewrote the script to support interactive batch approval and running from a cron job. I have published the [script in my own repo](https://gitlab.com/anarcat/scripts/blob/master/ikiwiki-comment-moderation) since it's python (and not perl), but I would be happy to provide it as a patch here if that's acceptable. + +The basic usage is as follows. First, you add the script in a cron job: + + 9 * * * * /home/anarcat/bin/ikiwiki-comment-moderation --source-dir=$HOME/source/ + +This will run every hour. When there are no comments to moderate, the script is silent and you will not get mail. Otherwise you will get something like this: + + date ip user subject content + 2020-05-27T03:42:05Z 192.168.0.116 spammer name subject spammy comment + 1 comments pending moderation + +Then you can either go through the web interface to approve/deny the +comments, or call the script by hand, interactively, for example: + + w-anarcat@marcos:~/source$ ~anarcat/bin/ikiwiki-comment-moderation --source-dir=$HOME/source -i --verbose + Date : 2020-05-27T04:00:23Z + Ip : 192.168.0.116 + Claimedauthor : spammer name + Subject : subject + Content : spammy comment + approve, delete, ignore? [a/d/I] a + moving /home/w-anarcat/source/.ikiwiki/transient/blog/2020-04-27-drowning-camera/comment_1_07f43231a14d0ee6e78d1030aa6a7985._comment_pending to /home/w-anarcat/source/blog/2020-04-27-drowning-camera/comment_1_07f43231a14d0ee6e78d1030aa6a7985._comment + adding to git... + [master 8f5cb10f] approve comment + 1 file changed, 9 insertions(+) + create mode 100644 blog/2020-04-27-drowning-camera/comment_1_07f43231a14d0ee6e78d1030aa6a7985._comment + Énumération des objets: 8, fait. + Décompte des objets: 100% (8/8), fait. + Compression par delta en utilisant jusqu'à 12 fils d'exécution + Compression des objets: 100% (5/5), fait. + Écriture des objets: 100% (5/5), 566 bytes | 566.00 KiB/s, fait. + Total 5 (delta 3), réutilisés 0 (delta 0) + To /home/w-anarcat/source.git + 91669038..8f5cb10f master -> master + approved comment + +And you're done! In the above case, the test comment was actually +approved (by pressing `a`), but you can also hit `d` to just delete +the comment. The default (`i`) is to ignore the comment. + +I find that this is generally faster than going through a web browser, although to be as fast as the CGI interface, there would need to be a final dialog that says "delete all ignored comments" like in the CGI. Exercise for the reader or, I guess, myself when I got too many junk comments to process... + +Feedback, as usual, welcome. -- [[anarcat]]