Commit dd06810

Add bytes_to_utf8_temp_pv()
This is like bytes_to_utf8_free_me(), but any new memory is arranged to be freed at the end of the current pseudo block via SAVEFREEPV. This adds the one function that was missing from the set of inverses of the utf8_to_bytes_foo() functions.
1 parent e7a6a02 commit dd06810
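
The ownership rule described in the commit message can be illustrated outside of Perl. What follows is a minimal standalone sketch, not the code from this commit: every `model_*` name is hypothetical, the small array stands in for Perl's savestack, and the conversion assumes an ASCII platform with Latin-1 input. It only aims to show why the caller never frees the result of the `_temp_pv` variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Perl's savestack: pointers registered here are
 * freed when the enclosing pseudo block is left. Not Perl's implementation. */
#define MODEL_MAX_SAVED 16
static void  *model_saved[MODEL_MAX_SAVED];
static size_t model_saved_count = 0;

/* Plays the role of SAVEFREEPV: defer free(p) until the scope is left. */
static void model_save_free(void *p)
{
    assert(model_saved_count < MODEL_MAX_SAVED);
    model_saved[model_saved_count++] = p;
}

/* Plays the role of leaving the pseudo block; returns how many were freed. */
static size_t model_leave_scope(void)
{
    size_t freed = model_saved_count;
    while (model_saved_count > 0)
        free(model_saved[--model_saved_count]);
    return freed;
}

/* Models bytes_to_utf8_free_me() for Latin-1 input: returns s itself when no
 * byte needs expanding (and free_me is non-NULL), else a new malloc'd copy. */
static unsigned char *
model_to_utf8_free_me(const unsigned char *s, size_t *lenp, void **free_me)
{
    size_t extra = 0;               /* bytes >= 0x80 each grow by one byte */
    for (size_t i = 0; i < *lenp; i++)
        extra += (s[i] >= 0x80);

    if (free_me) {
        *free_me = NULL;
        if (extra == 0)
            return (unsigned char *) s;     /* already valid as UTF-8 */
    }

    unsigned char *out = malloc(*lenp + extra + 1);
    if (out == NULL)
        abort();                    /* the model simply gives up on OOM */

    size_t j = 0;
    for (size_t i = 0; i < *lenp; i++) {
        if (s[i] < 0x80) {
            out[j++] = s[i];
        } else {                    /* two-byte UTF-8 sequence */
            out[j++] = (unsigned char) (0xC0 | (s[i] >> 6));
            out[j++] = (unsigned char) (0x80 | (s[i] & 0x3F));
        }
    }
    out[j] = '\0';
    *lenp = j;
    if (free_me)
        *free_me = out;
    return out;
}

/* Models bytes_to_utf8_temp_pv(): same result, but cleanup is automatic. */
static unsigned char *
model_to_utf8_temp(const unsigned char *s, size_t *lenp)
{
    void *free_me = NULL;
    unsigned char *converted = model_to_utf8_free_me(s, lenp, &free_me);

    if (free_me)
        model_save_free(free_me);

    return converted;
}
```

In this model a caller passes the string and its length to `model_to_utf8_temp()`, uses the returned pointer until the scope ends, and never calls `free()` on it. `model_leave_scope()` then releases whatever was allocated, which is nothing at all when the input was returned unchanged.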

6 files changed (+40, -1 lines)

embed.fnc

+3
@@ -800,6 +800,9 @@ Adp |U8 * |bytes_to_utf8_free_me \
                 |NN const U8 *s \
                 |NN STRLEN *lenp \
                 |NULLOK void **free_me
+Adip |U8 * |bytes_to_utf8_temp_pv \
+                |NN const U8 *s \
+                |NN STRLEN *lenp
 AOdp |SSize_t|call_argv |NN const char *sub_name \
                 |I32 flags \
                 |NN char **argv

embed.h

+1
@@ -157,6 +157,7 @@
 # define bytes_from_utf8(a,b,c) Perl_bytes_from_utf8(aTHX_ a,b,c)
 # define bytes_to_utf8(a,b) Perl_bytes_to_utf8(aTHX_ a,b)
 # define bytes_to_utf8_free_me(a,b,c) Perl_bytes_to_utf8_free_me(aTHX_ a,b,c)
+# define bytes_to_utf8_temp_pv(a,b) Perl_bytes_to_utf8_temp_pv(aTHX_ a,b)
 # define c9strict_utf8_to_uv Perl_c9strict_utf8_to_uv
 # define call_argv(a,b,c) Perl_call_argv(aTHX_ a,b,c)
 # define call_atexit(a,b) Perl_call_atexit(aTHX_ a,b)

inline.h

+13
@@ -1236,6 +1236,19 @@ Perl_bytes_to_utf8(pTHX_ const U8 *s, STRLEN *lenp)
     return bytes_to_utf8_free_me(s, lenp, NULL);
 }
 
+PERL_STATIC_INLINE U8 *
+Perl_bytes_to_utf8_temp_pv(pTHX_ const U8 *s, STRLEN *lenp)
+{
+    void * free_me = NULL;
+    U8 * converted = bytes_to_utf8_free_me(s, lenp, &free_me);
+
+    if (free_me) {
+        SAVEFREEPV(free_me);
+    }
+
+    return converted;
+}
+
 PERL_STATIC_INLINE bool
 Perl_utf8_to_bytes_new_pv(pTHX_ U8 const **s_ptr, STRLEN *lenp, void ** free_me)
 {

pod/perldelta.pod

+10 -1
@@ -360,7 +360,16 @@ well.
 
 =item *
 
-XXX
+Two new API functions are introduced to convert strings encoded in
+native bytes format to UTF-8. These return the string unchanged if its
+UTF-8 representation is the same as the original. Otherwise, new memory
+is allocated to contain the converted string. This is in contrast to
+the existing L<perlapi/C<bytes_to_utf8>> which always allocates new
+memory. The new functions are L<perlapi/C<bytes_to_utf8_free_me>> and
+L<perlapi/C<bytes_to_utf8_temp_pv>>.
+L<perlapi/C<bytes_to_utf8_temp_pv>> arranges for the new memory to
+automatically be freed. With C<bytes_to_utf8_free_me>, you are
+responsible for freeing any newly allocated memory.
 
 =back
 
proto.h

+5
Generated file; diff not rendered by default.

utf8.c

+8
@@ -3256,6 +3256,7 @@ Perl_bytes_from_utf8(pTHX_ const U8 *s, STRLEN *lenp, bool *is_utf8p)
 /*
 =for apidoc bytes_to_utf8
 =for apidoc_item bytes_to_utf8_free_me
+=for apidoc_item bytes_to_utf8_temp_pv
 
 These each convert a string C<s> of length C<*lenp> bytes from the native
 encoding into UTF-8 (UTF-EBCDIC on EBCDIC platforms), returning a pointer to
@@ -3275,6 +3276,13 @@ already there.
 In both cases, the caller is responsible for arranging for any new memory to
 get freed.
 
+C<bytes_to_utf8_temp_pv> simply returns a pointer to the input string if the
+string's UTF-8 representation is the same as its native representation, thus
+behaving like C<bytes_to_utf8_free_me> in this situation. Otherwise, it
+behaves like C<bytes_to_utf8>, returning a pointer to new memory containing the
+conversion of the input. The difference is that it also arranges for the new
+memory to automatically be freed by calling C<L</SAVEFREEPV>> on it.
+
 C<bytes_to_utf8_free_me> takes an extra parameter, C<free_me> to communicate.
 to the caller that memory was allocated or not. If that parameter is NULL,
 C<bytes_to_utf8_free_me> acts identically to C<bytes_to_utf8>, always
