package Convert::YText;
use vars qw/$VERSION @ISA @EXPORT_OK/;
@EXPORT_OK = qw( encode_ytext decode_ytext );
Convert::YText - Quotes strings suitably for RFC 2822 local part
  use Convert::YText qw(encode_ytext decode_ytext);

  $encoded = encode_ytext($string);
  $decoded = decode_ytext($encoded);

  ($decoded eq $string) || die "this should never happen!";
Convert::YText converts strings to and from "YText", a format inspired
by xtext (defined in RFC 1894) and by the MIME base64 and
quoted-printable encodings (RFC 2045). The main goal is to encode a
UTF-8 string into something safe for use as the local part of an
internet email address (RFC 2822).
According to RFC 2822, the following non-alphanumerics are OK for the
local part of an address: "!#$%&'*+-/=?^_`{|}~". On the other hand, it
seems common in practice to block addresses having "%!/|`#&?" in the
local part. The idea is to restrict ourselves to basic ASCII
alphanumerics, plus a small set of printable ASCII, namely "=_+-~.".
Spaces are replaced with "_", "/" with "~", the characters
"A-Za-z0-9.+-" encode as themselves, and everything else (including a
literal "=", "_" or "~") is written "=USTR=" where USTR is the base-64
representation (using "A-Za-z0-9+-." as digits) of the Unicode
character code.
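As a rough illustration of the scheme described above, the following
standalone sketch re-implements the encoding outside the module. The
names sketch_encode, sketch_decode and their helpers are hypothetical,
and the digit alphabet is assumed to be the first 64 characters of the
module's digit string; it is not the module's actual code.

```perl
use strict;
use warnings;

# Assumed digit alphabet: the first 64 characters of the module's digit string.
my $SKETCH_DIGITS = join '', 'A'..'Z', 'a'..'z', '0'..'9', '+', '-';

# Hypothetical helper: write a positive integer in base 64, high digit first.
sub sketch_encode_num {
    my ($num) = @_;
    my $str = '';
    while ($num > 0) {
        $str = substr($SKETCH_DIGITS, $num % 64, 1) . $str;
        $num >>= 6;
    }
    return $str;
}

# Hypothetical helper: turn a base-64 digit string back into a character.
sub sketch_decode_str {
    my ($str) = @_;
    my $num = 0;
    for my $char (split //, $str) {
        my $digit = index $SKETCH_DIGITS, $char;
        die "invalid digit '$char'\n" if $digit < 0;
        $num = ($num << 6) + $digit;
    }
    return chr $num;
}

# Escape everything outside the pass-through set, then swap in '_' and '~'.
sub sketch_encode {
    my ($str) = @_;
    $str =~ s{([^A-Za-z0-9+\-/. ])}{'=' . sketch_encode_num(ord $1) . '='}ge;
    $str =~ tr{ /}{_~};
    return $str;
}

# Undo the '_' and '~' substitutions first, then expand the =USTR= escapes.
sub sketch_decode {
    my ($str) = @_;
    $str =~ tr{_~}{ /};
    $str =~ s{=([A-Za-z0-9+\-]+)=}{sketch_decode_str($1)}ge;
    return $str;
}

my $encoded = sketch_encode('a b/c@d');
print "$encoded\n";                      # prints a_b~c=BA=d
print sketch_decode($encoded), "\n";     # prints a b/c@d
```

Here "@" (character code 64) needs two base-64 digits, "BA", which is
why a decoder has to capture the whole digit run between the "=" signs.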
The characters '+' and '-' are pretty widely used to attach suffixes
(although usually only one works on a given mail host). It seems OK to
use both '+' and '-', since the first occurrence marks the beginning
of a suffix and any later one is a regular character. The character
'.' also seems mostly permissible.
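The suffix behaviour can be illustrated with a small sketch. It assumes
a mail host that treats the first '+' as the suffix delimiter; the
address and variable names are made up for the example.

```perl
use strict;
use warnings;

# Assumption: the mail host splits the local part at the first '+'.
my $local_part = 'user+notes-2008+draft.txt';

# A limit of 2 keeps any later '+' (and every '-') inside the suffix.
my ($mailbox, $suffix) = split /\+/, $local_part, 2;

print "mailbox: $mailbox\n";   # mailbox: user
print "suffix:  $suffix\n";    # suffix:  notes-2008+draft.txt
```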
our $digit_string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-.";
our $valid_rex = qr{[A-Za-z0-9\+\-\.\=\_\~]+};
our @digits = split "", $digit_string;
	my $remainder = $num % 64;
	$str = $digits[$remainder] . $str;
    my @chars = split "", $str;
    while (scalar(@chars) > 0) {
	# look up the numeric value of the leading digit
	my $remainder = index $digit_string, $chars[0];
	# convert this to carp or something
	die "invalid YText digit '$chars[0]'\n" if ($remainder < 0);
    # print STDERR "num=$num\n";   # debugging output, disabled
    # "=" we use as an escape, and '_' for space
    $str =~ s/([^a-zA-Z0-9+\-\/. ])/"=".encode_num(ord($1))."="/ge;
    # the quantifier belongs inside the group so that $1 holds every digit
    $str =~ s/=([a-zA-Z0-9+\-\.]+)=/decode_str($1)/eg;
Finish doc. Write tests.

David Bremner, E<lt>bremner@unb.caE<gt>

Copyright (C) 2008 David Bremner. All Rights Reserved.

This module is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
This module is currently in B<BETA> condition. It should not be used
in a production environment, and is released with no warranty of any
kind.

Corrections, suggestions, bug reports and tests are welcome!
L<MIME::Base64>, L<MIME::Decoder::Base64>, L<MIME::Decoder::QuotedPrint>.