POC C static shared string HEKPOOL API #23042
Open
bulk88 wants to merge 2 commits into Perl:blead from bulk88:hekpool_POC
All the code in gv.c is very old and has received zero optimization since 5.000 alpha. SV*s are turned into PVNs on the front end, instantly losing any chance of [future] SVPV COW shared HEK key string optimization. HEK*s are unknown to the gv_* API. All inputs are continuously parsed for ' and :: without exception, even if they are read-only (SEGV) C literals, PP SvREADONLY()/SvPROTECT() read-only literals, or HEK* PV buffers that are read-only by API contract. The HE*s returned from hv_store*() and hv_fetch*() aren't exploited to pass the shared HEK* on to gv_init_*() or gv_name_set(). gv_name_set() only understands PVNs on the front end, but on the back end, in the GP struct and GV body struct, it ONLY understands HEK*s. The result is no RC++, and the ShHEK gets looked up again in PL_strtab.

The large number of tiny exported wrapper functions added over the years also causes C debugger call stacks, even at -O1/-O2, to be 2-5 call frames deep in 3-line shim/stub functions before reaching the main logic. I can't tell what is a mathom and what isn't.

So, to lay the provisions needed for future commits that add proper SV*/HEK*/U32 hash precalculation (not to mention that the memcmp() in hv_common() is skipped if the left and right pointer addresses are equal), the front end of gv_* needs cleanup.

- Move U32 flags to the start of the function, so flags can encode what void * #1 means and whether vararg void * #2 exists (a PVN, with N as a size_t, is the only user of the 2nd arg right now). gv_stashpvs() is very common in core and on CPAN, and it is called over and over within one process, because most interpreter core and CPAN XS devs don't know that GV*s have an RC that can be ++ed and stored in a MY_CXT struct. Also, nobody knows that "stashes" are HV*s, or that PP packages/classes are implemented with HV*s. So there is reason to pay extra attention to gv_stashpvs(), because of its high number of call sites per library. If the STRLEN can be constant folded by the CC and fits in a U8, store the length in the flags arg. That saves CPU ops in all the callers, which push 2 args instead of 3. The create arg of the public API gv_stashpvs(str, create) [flags in reality] can't be optimized away or removed, so combine the 2 compile-time constant args so that they fold/optimize into 1 CPU op.
- At some point perl core needs to cache/create/move around C-level arrays of RC++ed ShHEKs to pass to the gv_*() APIs. SVPVs aren't exactly the right format for storing sanitized (no */::/'/SUPER/main/UNIVERSAL) and pre-parsed/split "package tokens", since SVs easily wind up in or escape into PP state, and the SV RO flags/COW flags aren't the most honored and respected parts of the API in CPAN XS/maybe core. ShHEKs escaping into PP state is rarer than "generic SVs" escaping into PP state or CPAN XS state. All legacy XS code of any quality, and entry-level/beginner XS people, will pick the "char *" getter macros over an unknown opaque "HEK" type (and newSVpvn() to capture/move those char *s). Users who know what a HEK* is and how to RC++ it know not to write to it. Also, a bad write to a ShHEK will cause PP or SEGV breakage/panics or process exits a lot faster than a bad write to a SVfRO "SVPV" buffer. A hash that doesn't match the char string in a ShHEK will terminate the process faster. So the vararg on gv_*() is a provision for a future prototype that accepts 1, 2, 3 or more HEK*s, passed array style, that were already sanitized to not have ::s. A 0xFF length was picked because there was bitfield space. Shaving it to 32/64/128 chars for gv_stashpvs(str, create) is possible if the bits are needed: because a terminal is 80 chars, that would still fit almost all absolute ("::") C string package names, and everything in core and CPAN.
- The stubs remain as exported stub functions, on purpose for now. It makes certain diag tools I use slightly easier to use vs. optimized-out inlines or macros. In 5.43 or 5.45 the exported stub functions can be converted to macros (no static inline), which is the intent of this commit. The vararg is the one and only entry point to all of the gv_stash* logic.
- Flipping I32 flags to the front requires "_p" suffixes (for private) for ABI reasons. The public API still thinks I32 flags is always the last arg.
- Since all front-end wrappers are 1 frame away instead of multiple frames away, they are more likely to LTO inline away inside of libperl (not XS) on any CC. CCs have cost/benefit/wall-time cutoffs for scoring potential inline opportunities. Going 2 layers, or 3+ layers, of small inlines is asking a lot from a CC, which has to traverse a tree of nodes to do each inline, and the cutoff could be as low as 1 inline function and no more unrolling or folding.
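The length-in-flags idea can be sketched in plain C as below. This is a minimal standalone illustration, not the code in this PR: the `DEMO_*` names, the 24/8 bit split, the `demo_args` struct and the `demo_stashpvs()` wrapper are all hypothetical, and the real flag layout in gv.c may differ. The point is only that a literal's length and a constant flags value can be combined into one compile-time constant, with a run-time `strlen()` fallback when the length does not fit in 8 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout (NOT Perl's real flag bits): the low 24 bits carry the
 * caller's ordinary flags, the top 8 bits carry a precomputed name length. */
#define DEMO_LEN_SHIFT 24
#define DEMO_FLAG_MASK 0x00FFFFFFu
#define DEMO_GV_ADD    0x01u            /* stand-in for a "create" flag */

/* Fold a string literal's length into the flags word. Both inputs are
 * compile-time constants, so the whole expression is one constant. Lengths
 * above 0xFF pack as 0, which means "not packed, measure at run time". */
#define DEMO_PACK(lit, flags) \
    ((uint32_t)((sizeof(lit) - 1 <= 0xFFu \
                     ? (uint32_t)(sizeof(lit) - 1) << DEMO_LEN_SHIFT \
                     : 0u) \
                | ((uint32_t)(flags) & DEMO_FLAG_MASK)))

typedef struct {
    size_t   len;    /* name length, packed or measured */
    uint32_t flags;  /* caller's flags with the length bits stripped */
} demo_args;

/* Stand-in for a flags-first private entry point: recover (len, flags). */
static demo_args demo_unpack(uint32_t packed, const char *name)
{
    demo_args a;
    a.flags = packed & DEMO_FLAG_MASK;
    a.len   = packed >> DEMO_LEN_SHIFT;
    if (a.len == 0)
        a.len = strlen(name);            /* length was not packed */
    else
        assert(a.len == strlen(name));   /* debug check of the packed length */
    return a;
}

/* Hypothetical 2-argument wrapper: callers pass (flags+len, name), not 3 args.
 * The "" lit "" trick rejects anything that is not a string literal. */
#define demo_stashpvs(lit, create) \
    demo_unpack(DEMO_PACK(lit, create), "" lit "")
```

With this layout, `demo_stashpvs("Foo::Bar", DEMO_GV_ADD)` passes the single constant `0x08000001` plus the string pointer, and the callee recovers a length of 8 without calling `strlen()`.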
POC C static shared string HEKPOOL API.
Related/semi potential fix to #22872
Slightly related to my inter-ithread malloc ShHEK code in #14725
Fails a couple of tests, fails asserts in perl_destruct, and has bugs where hvkvsplit isn't working correctly on HV* PL_strtab and isn't sorting collisions with malloc ShHEKs on top of the LL and GShHEKs on the bottom of the LL.
Note that all ithreads use/see the same GShHEK ptrs, but all ithreads keep an independent HV* PL_strtab.
Parts of this patch aren't fully baked. Computing the hash keys, which are per OS process, is done lazily, but there are plenty of HEKs/strings that P5/P5P involuntarily forces onto users, even in:
perl -e"0;"
Not baked: initially I planned on malloc HEKs and GHEKs with duplicate hash number, length and bytes both living in the same HvARRAY slot in PL_strtab as collisions; later, on sorting GHEKs below malloc HEKs; later still, on prohibiting creation of malloc HEKs that conflict with (are identical to) GHEKs and vivifying the GHEK instead. This search code is half baked. I tried some optimizations, like a bitfield of known lengths, as an alternative to GCC's horrible multi-KB C switch jump tables and MSVC's massive if/else trees (MSVC -O1 doesn't include C switch jump tables (function pointers halfway into a function), and MSVC -O2 as a switch is too broken to ever use in production without profile guided optimization and data training/profiling). But my rapid bitfield reject mask doesn't really reject very much: ORing together, per length, all of the ASCII uppercase A-Z chars at each char position produced 0x5F, the full A-Z range, 3 of 4 times or 1 of 2 times, rejecting nothing.
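Why the OR mask saturates can be shown with a small standalone sketch. The helper names and the four example tokens are hypothetical and are not taken from the patch's token list. 'A' is 0x41 and 'Z' is 0x5A, so every uppercase letter has bit 0x40 set and some subset of the low five bits. A handful of different letters at one position is enough to OR to 0x5F, and at that point no uppercase letter has a bit outside the mask.

```c
#include <assert.h>
#include <stddef.h>

/* OR together the bytes found at one character position across a token list.
 * Every token must be at least pos+1 bytes long. */
static unsigned char demo_or_mask(const char *const *tokens, size_t n, size_t pos)
{
    unsigned char mask = 0;
    size_t i;
    assert(tokens != NULL || n == 0);
    for (i = 0; i < n; i++)
        mask |= (unsigned char)tokens[i][pos];
    return mask;
}

/* A byte can only match some token at this position if it sets no bit
 * outside the mask. Returns 1 when the byte is rejected outright. */
static int demo_rejects(unsigned char mask, unsigned char c)
{
    return (c & (unsigned char)~mask) != 0;
}

/* Count how many of 'A'..'Z' the mask manages to reject. */
static int demo_count_rejected_uppercase(unsigned char mask)
{
    int rejected = 0;
    unsigned char c;
    for (c = 'A'; c <= 'Z'; c++)
        rejected += demo_rejects(mask, c);
    return rejected;
}
```

For example, first characters 'A', 'E', 'I' and 'S' (0x41, 0x45, 0x49, 0x53) already OR to 0x5F, so `demo_count_rejected_uppercase()` returns 0 for that position. The mask still rejects lowercase letters and digits, since both have bit 0x20 set.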
The code that handles malloc HEKs that conflict with (are identical to) GHEKs needs a binary search algorithm, char by char, because the token list is really long per length, but it doesn't have one right now. I can't find a single use of a binary search algorithm in the p5p repo in a .xs or .c file, except for win32/perlhost.h and the Compress::Raw::* libs.
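For reference, a binary search over a per-length token table could look roughly like the sketch below. It is a plain whole-key search using memcmp(), not the char-by-char variant described above, and the function name and table layout are assumptions for illustration only. The table must be sorted in memcmp() byte order and every entry must be exactly `len` bytes long.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look up `key` (exactly `len` bytes) in a table of `n` tokens that all have
 * length `len` and are sorted in memcmp() order.
 * Returns the index of the match, or -1 if the key is not in the table. */
static int demo_find_token(const char *const *sorted, size_t n,
                           const char *key, size_t len)
{
    size_t lo = 0, hi = n;               /* search the half-open range [lo, hi) */
    assert(sorted != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2; /* avoids overflow of lo + hi */
        int cmp = memcmp(key, sorted[mid], len);
        if (cmp == 0)
            return (int)mid;
        if (cmp < 0)
            hi = mid;                    /* key sorts before sorted[mid] */
        else
            lo = mid + 1;                /* key sorts after sorted[mid] */
    }
    return -1;
}
```

With one such table per token length, a lookup costs O(log n) memcmp() calls instead of a linear scan of that length's list.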