add compact vararg impl gv_stashsvpvn_cached_p, add gv_stashhek #23041

Open
wants to merge 1 commit into
base: blead
@bulk88 bulk88 commented Feb 27, 2025

All the code in gv.c is very old and has gotten zero optimizing since 5.000 alpha. SV*s are instantly turned into PVNs on the front end, instantly losing any chance of [future] SVPV COW shared HEK key string optimization. HEK*s are unknown to the gv_* API. All inputs are continuously parsed for ' and :: without exception, even if they are read-only (SEGV) C literals, PP SvREADONLY()/SvPROTECT() read-only literals, or HEK* PV buffers that are read-only by API contract. The HE*s returned from hv_store*()/hv_fetch*() aren't exploited to pass the shared HEK* on to gv_init_*() or gv_name_set(). gv_name_set() only understands PVNs on the front end, but on the back end, in the GP struct and GV body struct, it ONLY understands HEK*s. Therefore there is no RC++, and the ShHEK gets looked up again in PL_strtab.

The large number of tiny extern exported wrapper funcs added over the years also causes C debugger call stacks, even at -O1/-O2, to be 2-5 call frames deep in 3-line shim/stub functions before reaching the main logic. I can't tell what is a mathom and what isn't.

So, to lay the provisions needed for future commits that add proper SV*/HEK*/U32 hash precalculation (not to mention that the memcmp() in hv_common() is skipped if the left and right pointer addresses are equal), the front end of gv_* needs cleanup.
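To illustrate why passing an already-shared key through matters, here is a minimal sketch of a key comparison with a pointer-equality fast path. It is not the actual hv_common() code; `demo_key` and `demo_key_eq` are hypothetical names, and the struct is only a stand-in for a shared HEK, not its real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a shared key; NOT Perl's real HEK layout. */
typedef struct {
    unsigned long hash;  /* precalculated hash */
    size_t        len;   /* key length in bytes */
    const char   *str;   /* key bytes, possibly shared between keys */
} demo_key;

/* Returns 1 if the keys match. *memcmp_calls counts how often the
 * byte-by-byte comparison was actually needed. */
static int demo_key_eq(const demo_key *a, const demo_key *b, int *memcmp_calls)
{
    assert(a != NULL && b != NULL && memcmp_calls != NULL);
    if (a->hash != b->hash || a->len != b->len)
        return 0;               /* cheap rejects first */
    if (a->str == b->str)
        return 1;               /* same buffer: no memcmp() needed */
    ++*memcmp_calls;
    return memcmp(a->str, b->str, a->len) == 0;
}
```

When both sides point at the same shared buffer the comparison ends on the pointer test; a caller that hands in a fresh copy of the same string pays for the memcmp() every time.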

-move U32 flags to the start of the func, so flags can encode what void * #1 means and whether vararg void * #2 exists (a PVN, with N as size_t, is the only user of the 2nd arg right now). gv_stashpvs() is very common in core and on CPAN, and is called over and over in 1 proc, since most interp core and CPAN XS devs don't know GV*s have an RC that can be ++ed and stored in a MY_CXT struct. Also nobody knows "stashes" are HV*s, or that PP packages/classes are implemented with HV*s. So there is reason to pay extra attention to gv_stashpvs() b/c of its high usage/number of call sites per library. If the STRLEN can be constant folded by the CC and fits in a U8, store the length in the flags arg. That saves CPU ops in all the callers, which push 2 args instead of 3. The create arg [flags in reality] of public API gv_stashpvs(str, create) can't be optimized away or removed, so combine the 2 CC-time constant args so they fold/optimize into 1 CPU op.

-at some point perl core needs to cache/create/move around C-level arrays of RC++ed ShHEKs to pass to the gv_*() APIs. SVPVs aren't exactly the right format for storing sanitized (no */::/'/SUPER/main/UNIVERSAL) and pre-parsed/split "package tokens", since SVs easily wind up in or escape into PP-state, and SV RO flags/COW flags aren't the most honored and respected parts of the API in CPAN XS/maybe core.

ShHEKs escaping into PP-state is rarer than "generic SVs" escaping into PP-state or CPAN XS state. Legacy XS code of any quality, and entry-level/beginner XS people, will pick "char *" getter macros over an unknown opaque "HEK" type (and newSVpvn() to capture/move those char *s). Users who know what a HEK* is and how to RC++ it know not to write to it. Also, a bad write to a ShHEK will cause PP breakage, SEGVs, panics, or proc exits a lot faster than a bad write to a SVfRO "SVPV" buffer. A hash that doesn't match the char string in a ShHEK will terminate the proc faster. So vararg on gv_*() is a provision for a future prototype that accepts 1, 2, 3 or more HEK*s, passed array style, that were already sanitized to not have ::s.

0xFF length was picked b/c there was bitfield space. Shaving it to 32/64/128 chars for gv_stashpvs(str, create) is possible if the bits are needed: b/c a terminal is 80 chars, that would still fit almost all absolute ("::") C string package names, and everything in core and CPAN.
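The length-in-flags idea can be sketched roughly as below. The bit layout and every `DEMO_*`/`demo_*` name are assumptions made up for illustration; they are not the constants this PR adds. The point is only that a literal's length and the create flags are both compile-time constants, so the CC can fold them into one immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define DEMO_CREATE     0x00000001u  /* stands in for the "create" flag */
#define DEMO_HAS_LEN    0x00000100u  /* length is packed into the flags word */
#define DEMO_LEN_SHIFT  24
#define DEMO_LEN_MASK   0xFFu        /* 8 bits, i.e. lengths 0..255 */

/* Evaluates to 0, but fails to compile (negative array size) when the
 * literal is longer than the 8-bit field can hold. */
#define DEMO_LEN_FITS(lit) \
    (0 * sizeof(char[(sizeof(lit) - 1 <= DEMO_LEN_MASK) ? 1 : -1]))

/* Fold the literal's length and the caller's flags into one constant. */
#define DEMO_PACK(lit, create_flags) \
    ((uint32_t)(create_flags) | DEMO_HAS_LEN | \
     (uint32_t)((sizeof(lit) - 1 + DEMO_LEN_FITS(lit)) << DEMO_LEN_SHIFT))

/* Callee side: recover the pieces from the single flags argument. */
static int demo_has_len(uint32_t flags)
{
    return (flags & DEMO_HAS_LEN) != 0;
}

static size_t demo_len(uint32_t flags)
{
    return (size_t)((flags >> DEMO_LEN_SHIFT) & DEMO_LEN_MASK);
}
```

A caller would pass `DEMO_PACK("Foo::Bar", DEMO_CREATE)` plus the string pointer, 2 args instead of 3. Shrinking the field to 5, 6 or 7 bits would only change the shift and mask.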

-the stubs remain as exported stub funcs on purpose for now; it makes
certain diag tools I use slightly easier to use vs optimized-out inlines
or macros. In 5.43 or 5.45 the exported stub funcs can be converted to
macros (no static inline), which is the intent of this commit. The vararg
is the 1 and only entry point to all of the gv_stash* logic.

-flipping I32 flags to the front requires "_p" (private) suffixes for
ABI reasons; public API still thinks I32 flags is always the last arg
-since all front end wrappers are now 1 frame away from the main logic,
instead of multiple frames away, they are more likely to LTO-inline away
inside of libperl (not XS) on any CC. CCs have cost/benefit/wall time
cutoffs for scoring potential inline opportunities. Going 2 layers, or
3+ layers, of small inlines is asking a lot from a CC that has to
traverse a tree of nodes to do each inline, and the cutoff could be as
low as 1 inline fn and no more unrolling or folding.
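The flags-first vararg shape described in these notes could look roughly like the sketch below. All `demo_*`/`DEMO_*` names are hypothetical, and the body only resolves the key length instead of doing a real stash lookup. It shows a private entry point whose leading flags say which varargs follow, with public-style wrappers that keep flags last and sit exactly 1 call away.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical selector bits, for illustration only. */
#define DEMO_ARG_PV   0x1u  /* varargs: const char * (NUL terminated) */
#define DEMO_ARG_PVN  0x2u  /* varargs: const char *, then size_t length */

/* Private "_p" style entry point: flags come first so they can describe
 * what vararg #1 means and whether vararg #2 exists.
 * Returns the resolved key length as a stand-in for the real lookup. */
static size_t demo_stash_p(uint32_t flags, ...)
{
    va_list ap;
    const char *pv;
    size_t len;

    va_start(ap, flags);
    pv = va_arg(ap, const char *);
    assert(pv != NULL);
    if (flags & DEMO_ARG_PVN)
        len = va_arg(ap, size_t);   /* 2nd vararg only read when flagged */
    else
        len = strlen(pv);
    va_end(ap);
    return len;
}

/* Public-style wrappers keep flags as the last arg, 1 frame from the logic. */
static size_t demo_stashpvn(const char *pv, size_t len, uint32_t flags)
{
    return demo_stash_p(flags | DEMO_ARG_PVN, pv, len);
}

static size_t demo_stashpv(const char *pv, uint32_t flags)
{
    return demo_stash_p(flags | DEMO_ARG_PV, pv);
}
```

Because each wrapper is a single forwarding call, a compiler only has to inline 1 layer to make the wrapper disappear, rather than chase a chain of shims.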


  • This set of changes does not require a perldelta entry.
  • no external xs api changes
