Avoid generating malformed UTF-8 and replacement characters by inter… #160

1 change: 1 addition & 0 deletions MANIFEST
@@ -311,6 +311,7 @@ t/900_bugs/042_perl59_issue.t
 t/900_bugs/043_issue107.t
 t/900_bugs/044_empty_result.t
 t/900_bugs/045_issue130.t
+t/900_bugs/046_issue88.t
 t/900_bugs/issue79/tmpl/contentA.tt
 t/900_bugs/issue79/tmpl/contentB.tt
 t/900_bugs/issue79/tmpl/wrapperA.tt
4 changes: 2 additions & 2 deletions README.md
@@ -62,8 +62,8 @@ to read the documentation online with your favorite pager.
 # RESOURCE

 web site: http://xslate.org/
-repositories: http://github.com/xslate
-issue tracking: http://github.com/xslate/issues
+repositories: https://github.com/xslate
+issue tracking: https://github.com/xslate/p5-Text-Xslate/issues

 # LICENSE AND COPYRIGHT

2 changes: 1 addition & 1 deletion author/t_renumber.pl
@@ -4,7 +4,7 @@

 my $dir = shift(@ARGV) or die "Usage: $0 test-dir\n";
 $dir =~ s{/$}{};
--d $dir or die "No such directry: $dir\n";
+-d $dir or die "No such directory: $dir\n";

 my $i = 0;
 foreach my $dir (sort { ($a =~ /(\d+)_\w+\.t$/)[0] <=> ($b =~ /(\d+)_\w+\.t$/)[0] } <$dir/*.t>) {
6 changes: 3 additions & 3 deletions lib/Text/Xslate.pm
@@ -1233,13 +1233,13 @@ A "too-safe" HTML escaping filter which escape all the symbolic characters
 WEB: L<http://xslate.org/>
-PROJECT HOME: L<http://github.com/xslate/>
+PROJECT HOME: L<https://github.com/xslate/>
-REPOSITORY: L<http://github.com/xslate/p5-Text-Xslate/>
+REPOSITORY: L<https://github.com/xslate/p5-Text-Xslate/>
 =head1 BUGS
-Please make a file on L<http://github.com/xslate/>. Patches are always welcome.
+Please report issues at L<https://github.com/xslate/p5-Text-Xslate/issues>. Patches are always welcome.
 =head1 SEE ALSO
 Documents:
2 changes: 1 addition & 1 deletion lib/Text/Xslate/Bridge/Star.pm
@@ -153,7 +153,7 @@ See L<perldoc/sprintf> for details.

 =head2 C<rx($regex_pattern)>

-Compiles I<$regex_patter> as a regular expression and return the regex object. You can pass a regex object to C<match()> or C<replace()> described bellow.
+Compiles I<$regex_patter> as a regular expression and return the regex object. You can pass a regex object to C<match()> or C<replace()> described below.
 The same as C<qr//> operator in Perl.

 =head2 C<match($str, $pattern)>
2 changes: 1 addition & 1 deletion lib/Text/Xslate/Manual.pod
@@ -27,7 +27,7 @@ Xslate manual pages are not yet completed. Patches are welcome.

 L<http://xslate.org> - The Xslate web site

-L<http://github.com/xslate> - Xslate repositories
+L<https://github.com/xslate> - Xslate repositories

 =cut

4 changes: 2 additions & 2 deletions lib/Text/Xslate/Manual/FAQ.pod
@@ -87,7 +87,7 @@ C<malloc()> is to pre-allocate the output buffer in an intelligent manner:
 For example, Text::Xslate assumes that most templates will be rendered to be
 about the same as the previous run, so when a template is rendered it uses
 the size allocated for the previous rendering as an approximation of how much
-space the current rendering will require. This allows to greatly reduce the
+space the current rendering will require. This allows you to greatly reduce the
 number of C<malloc()> calls required to render a template.

 =back
@@ -196,7 +196,7 @@ It is unlikely to need to write plugins for Xslate, because Xslate allows
 you to export any functions to templates. Any function-based modules
 are available by the C<module> option.

-Xslate also allows to call methods for object instances, so you can
+Xslate also allows you to call methods for object instances, so you can
 use any object-oriented modules, except for classes which only provide
 class methods (they need wrappers).

4 changes: 2 additions & 2 deletions lib/Text/Xslate/PP/State.pm
@@ -177,7 +177,7 @@ sub print {
 if(defined ${$sv}) {
 $st->{output} .=
 (utf8::is_utf8($st->{output}) && !utf8::is_utf8(${$sv}))
-? $st->encoding->decode(${$sv})
+? eval {$st->encoding->decode(${$sv}, Encode::FB_CROAK())} || ${$sv}
 : ${$sv};
 }
 else {
@@ -188,7 +188,7 @@ sub print {
 $sv =~ s/($Text::Xslate::PP::html_metachars)/$Text::Xslate::PP::html_escape{$1}/xmsgeo;
 $st->{output} .=
 (utf8::is_utf8($st->{output}) && !utf8::is_utf8($sv))
-? $st->encoding->decode($sv)
+? eval {$st->encoding->decode($sv, Encode::FB_CROAK())} || $sv
 : $sv;
 }
 else {
2 changes: 1 addition & 1 deletion lib/Text/Xslate/Runner.pm
@@ -141,7 +141,7 @@ has suffix => (
 );

 has dest => (
-documentation => 'Destination directry',
+documentation => 'Destination directory',
 cmd_aliases => [qw(o)],
 is => 'ro',
 isa => 'Str', # Maybe[Str]
6 changes: 3 additions & 3 deletions script/xslate
@@ -31,7 +31,7 @@ xslate - Process Xslate Templates
 --ie --input_encoding Input encoding (default: UTF-8)
 -i --ignore Regular expression the process will ignore
 -c --cache_dir Directory the cache files will be saved in
--o --dest Destination directry
+-o --dest Destination directory
 -w --verbose Warning level (default: 2)

 # one liners, with $ARGV and $ENV
@@ -42,7 +42,7 @@ xslate - Process Xslate Templates

 =head1 DESCRIPTION

-The xslate script is used to process entire directory trees containing
+The C<xslate> script is used to process entire directory trees containing
 template files, or to process one liners.

 =head1 ARGUMENTS
@@ -51,7 +51,7 @@ template files, or to process one liners.

 Specifies the target to be processed by Xslate.

-If the target is a file, the file is processed, and xslate will exit immediately. If the target is a directory, then the directory is traversed and each file found is processed via xslate.
+If the target is a file, the file is processed, and C<xslate> will exit immediately. If the target is a directory, then the directory is traversed and each file found is processed via C<xslate>.

 =head1 AUTHOR

32 changes: 29 additions & 3 deletions src/Text-Xslate.xs
@@ -541,19 +541,39 @@ tx_unmark_raw(pTHX_ SV* const str) {
 /* does sv_catsv_nomg(dest, src), but significantly faster */
 STATIC_INLINE void
 tx_sv_cat(pTHX_ SV* const dest, SV* const src) {
+STRLEN len;
+const char* pv = SvPV_const(src, len);
+
 if(!SvUTF8(dest) && SvUTF8(src)) {
 sv_utf8_upgrade(dest);
 }

-{
-STRLEN len;
-const char* const pv = SvPV_const(src, len);
+if(SvUTF8(dest) == SvUTF8(src)
+|| is_utf8_string((const U8 *)pv, len)) {
 STRLEN const dest_cur = SvCUR(dest);
 char* const d = SvGROW(dest, dest_cur + len + 1 /* count '\0' */);

 SvCUR_set(dest, dest_cur + len);
 Copy(pv, d + dest_cur, len + 1 /* copy '\0' */, char);
 }
+else {
+STRLEN const dest_cur = SvCUR(dest);
+/* Longest UTF-8 representation of each char is 2 octets. */
+char* const d_start = SvGROW(dest, dest_cur + 2 * len + 1 /* count '\0' */);
+char* d = d_start + dest_cur;
+
+while(len--) {
+const U8 c = *pv++;
+if (UTF8_IS_INVARIANT(c)) {
+*(d++) = c;
+} else {
+*(d++) = UTF8_EIGHT_BIT_HI(c);
+*(d++) = UTF8_EIGHT_BIT_LO(c);
+}
+}
+*d = '\0';
+SvCUR_set(dest, d - d_start);
+}
 }

 static void /* doesn't care about raw-ness */
@@ -563,6 +583,8 @@ tx_sv_cat_with_html_escape_force(pTHX_ SV* const dest, SV* const src) {
 const char* const end = cur + len;
 STRLEN const dest_cur = SvCUR(dest);
 char* d;
+const U32 upgrade_on_copy = SvUTF8(dest) && !SvUTF8(src)
+&& !is_utf8_string((const U8 *)cur, len);

 (void)SvGROW(dest, dest_cur + ( len * ( sizeof("&quot;") - 1) ) + 1);
 if(!SvUTF8(dest) && SvUTF8(src)) {
@@ -595,6 +617,10 @@ tx_sv_cat_with_html_escape_force(pTHX_ SV* const dest, SV* const src) {
 // CopyToken("&apos;", d);
 CopyToken("&#39;", d);
 }
+else if (upgrade_on_copy && !UTF8_IS_INVARIANT(c)) {
+*(d++) = UTF8_EIGHT_BIT_HI((U8) c);
+*(d++) = UTF8_EIGHT_BIT_LO((U8) c);
+}
 else {
 *(d++) = c;
 }
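
Note on the src/Text-Xslate.xs change: the added `else` branch of `tx_sv_cat` copies a non-UTF-8 source into a UTF-8 destination byte by byte, expanding every non-invariant byte into two bytes. The following is a minimal standalone sketch of that idea without the Perl API. The names `is_invariant` and `append_latin1_as_utf8` are hypothetical, and the bit arithmetic is an assumption about what `UTF8_EIGHT_BIT_HI` and `UTF8_EIGHT_BIT_LO` produce on ASCII platforms; it is not the code from this pull request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for UTF8_IS_INVARIANT on ASCII platforms:
 * bytes below 0x80 are encoded in UTF-8 as themselves. */
static int is_invariant(unsigned char c)
{
    return c < 0x80;
}

/* Append len Latin-1 bytes from src to a UTF-8 buffer dst that already
 * holds dst_len bytes. Each byte >= 0x80 becomes a two-byte sequence,
 * so dst must have room for dst_len + 2 * len + 1 bytes (worst case
 * plus the trailing '\0'). Returns the new length without the '\0'. */
static size_t append_latin1_as_utf8(char *dst, size_t dst_len,
                                    const unsigned char *src, size_t len)
{
    char *d;

    assert(dst != NULL && src != NULL);
    d = dst + dst_len;

    while (len--) {
        unsigned char c = *src++;
        if (is_invariant(c)) {
            *d++ = (char)c;
        } else {
            *d++ = (char)(0xC0 | (c >> 6));   /* lead byte, 0xC2 or 0xC3 */
            *d++ = (char)(0x80 | (c & 0x3F)); /* continuation byte */
        }
    }
    *d = '\0';
    return (size_t)(d - dst);
}
```

Under these assumptions, appending the single byte 0xC4 to a buffer that holds E3 81 82 (the UTF-8 encoding of あ) yields E3 81 82 C3 84. Reserving `2 * len + 1` bytes up front mirrors the `SvGROW` call in the patch and avoids growing the buffer inside the loop.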
2 changes: 1 addition & 1 deletion t/200_app/001_hello.t
@@ -24,7 +24,7 @@ system $^X, (map { "-I$_" } @INC), "script/xslate",

 is $?, 0, "command executed successfully (1)";

-ok -d CACHE_DIR, 'cache directry created';
+ok -d CACHE_DIR, 'cache directory created';

 ok -f sprintf('%s/out/hello.txt', $Bin), 'correct file generated';

23 changes: 23 additions & 0 deletions t/900_bugs/046_issue88.t
@@ -0,0 +1,23 @@
+#!perl
+# https://github.com/xslate/p5-Text-Xslate/issues/88
+use strict;
+use warnings;
+use Test::More;
+
+use utf8;
+use Text::Xslate 'mark_raw';
+my $xslate = Text::Xslate->new();
+
+is $xslate->render_string('<: $string :>', {string => "Ä"}) => 'Ä';
+is $xslate->render_string('<: $string :>', {string => "\x{c4}"}) => 'Ä';
+
+is $xslate->render_string('あ<: $string :>', {string => "Ä"}) => 'あÄ';
+is $xslate->render_string('あ<: $string :>', {string => "\x{c4}"}) => 'あÄ';
+
+is $xslate->render_string('<: $string :>', {string => mark_raw("Ä")}) => 'Ä';
+is $xslate->render_string('<: $string :>', {string => mark_raw("\x{c4}")}) => 'Ä';
+
+is $xslate->render_string('あ<: $string :>', {string => mark_raw("Ä")}) => 'あÄ';
+is $xslate->render_string('あ<: $string :>', {string => mark_raw("\x{c4}")}) => 'あÄ';
+
+done_testing();
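
Note on the test cases: the `あ` variants appear to cover the situation where the output buffer is already UTF-8 and the interpolated value is the lone byte 0xC4. Copying that byte verbatim would leave the buffer malformed. The sketch below is a strict well-formedness check written for illustration only. The name `is_wellformed_utf8` is hypothetical, and the function is stricter than Perl's `is_utf8_string` (which the patch calls), since it also rejects surrogates and code points above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the len bytes at s are well-formed UTF-8, otherwise 0.
 * Rejects stray continuation bytes, truncated sequences, overlong
 * forms, surrogates, and code points above U+10FFFF. */
static int is_wellformed_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i];
        size_t need;       /* continuation bytes expected after c */
        size_t k;
        unsigned long cp;  /* code point being decoded */
        unsigned long min; /* smallest code point valid for this length */

        if (c < 0x80) {
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0; /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need) {
            return 0; /* sequence is cut off at the end of the input */
        }
        for (k = 1; k <= need; k++) {
            unsigned char t = s[i + k];
            if ((t & 0xC0) != 0x80) {
                return 0;
            }
            cp = (cp << 6) | (t & 0x3F);
        }
        if (cp < min || cp > 0x10FFFFUL || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return 0;
        }
        i += need + 1;
    }
    return 1;
}
```

With this check, the bytes E3 81 82 C4 (あ followed by a raw 0xC4) are rejected because 0xC4 is a lead byte with no continuation byte, while E3 81 82 C3 84 (あ followed by the two-byte encoding of Ä) is accepted.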