Better uop coverage in the JIT optimizer #131798

Open
brandtbucher opened this issue Mar 27, 2025 · 16 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage topic-JIT type-feature A feature request or enhancement

Comments

@brandtbucher
Member

brandtbucher commented Mar 27, 2025

Out of 263 total uops, 155 of these are ignored by the tier two optimizer. These represent over half of all uops by dynamic execution count.

This issue will serve as a checklist for auditing these missing uops, and adding them where they make sense. At first glance, there's quite a bit of potential here... especially around the ability to narrow known output types (like _CONTAINS_OP_SET), and the ability to narrow and remove guards on input types (like _BINARY_OP_SUBSCR_LIST_INT). As I'm going through, I'll cross out anything that doesn't seem like it makes sense to add.
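To make the two ideas concrete, here is a minimal toy sketch in Python of what "narrowing output types" and "removing guards" could look like. It is not CPython's optimizer: the function name `remove_redundant_guards`, the two tables, and the simplified stack effects are all assumptions made for illustration, and a real pass would likely rewrite a redundant guard to a no-op rather than drop it from the trace.

```python
# Toy model: each abstract stack slot holds a known type name, or None
# when nothing is known about the value in that slot.

# guard uop -> (stack offset it checks, type name it proves)
GUARDS = {
    "_GUARD_TOS_INT": (-1, "int"),
    "_GUARD_NOS_INT": (-2, "int"),
}

# uop -> (number of inputs popped, known type of the single output)
# These stack effects are simplified assumptions, not CPython's real ones.
EFFECTS = {
    "_CALL_LEN": (1, "int"),
    "_CONTAINS_OP_SET": (2, "bool"),
    "_BINARY_OP_SUBSCR_LIST_INT": (2, None),
}


def remove_redundant_guards(trace, stack):
    """Return the uops of `trace` that are still needed, given the
    initial abstract `stack` (a list of type names or None)."""
    stack = list(stack)
    kept = []
    for uop in trace:
        if uop in GUARDS:
            offset, proven = GUARDS[uop]
            if stack[offset] == proven:
                continue  # type already known, so the guard cannot fail
            stack[offset] = proven  # once the guard passes, the type is known
        else:
            popped, output = EFFECTS[uop]
            del stack[len(stack) - popped:]
            stack.append(output)  # narrowed output type (or None)
        kept.append(uop)
    return kept


trace = ["_CALL_LEN", "_GUARD_TOS_INT", "_BINARY_OP_SUBSCR_LIST_INT"]
print(remove_redundant_guards(trace, [None, None]))
# ['_CALL_LEN', '_BINARY_OP_SUBSCR_LIST_INT']
```

Because the toy `_CALL_LEN` entry records that its output is an `int`, the `_GUARD_TOS_INT` that follows is provably redundant. A uop that the optimizer ignores would instead leave an unknown slot behind, and the guard would have to stay.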

First, here are the 53 missing uops that each represent at least 0.1% of all uops executed:

  • _SET_IP (12.1%)
  • _CHECK_VALIDITY (10.1%)
  • _CHECK_VALIDITY_AND_SET_IP (6.5%)
  • _CHECK_PERIODIC (3.1%)
  • _MAKE_WARM (2.8%)
  • _START_EXECUTOR (1.7%)
  • _GUARD_NOS_INT (1.5%)
  • _BINARY_OP_SUBSCR_LIST_INT (1.0%)
  • _CHECK_FUNCTION (1.0%)
  • _CHECK_MANAGED_OBJECT_HAS_VALUES (0.7%)
  • _ITER_CHECK_LIST (0.7%)
  • _CONTAINS_OP_SET (0.6%)
  • _FOR_ITER_TIER_TWO (0.6%)
  • _GUARD_NOT_EXHAUSTED_LIST (0.6%)
  • _ITER_NEXT_LIST_TIER_TWO (0.6%)
  • _SAVE_RETURN_OFFSET (0.6%)
  • _CALL_LEN (0.5%)
  • _CALL_LIST_APPEND (0.5%)
  • _POP_TOP (0.5%)
  • _RESUME_CHECK (0.5%)
  • _BINARY_OP_SUBSCR_STR_INT (0.4%)
  • _GUARD_DORV_VALUES_INST_ATTR_FROM_DICT (0.4%)
  • _GUARD_KEYS_VERSION (0.4%)
  • _BINARY_OP_SUBSCR_DICT (0.3%)
  • _CALL_BUILTIN_FAST (0.3%)
  • _CHECK_STACK_SPACE_OPERAND (0.3%)
  • _GET_ITER (0.3%)
  • _STORE_SUBSCR (0.3%)
  • _GUARD_NOT_EXHAUSTED_RANGE (0.2%)
  • _BINARY_SLICE (0.2%)
  • _BUILD_LIST (0.2%)
  • _CALL_BUILTIN_O (0.2%)
  • _CALL_NON_PY_GENERAL (0.2%)
  • _CHECK_IS_NOT_PY_CALLABLE (0.2%)
  • _GUARD_NOS_FLOAT (0.2%)
  • _ITER_CHECK_RANGE (0.2%)
  • _ITER_CHECK_TUPLE (0.2%)
  • _LOAD_DEREF (0.2%)
  • _STORE_SUBSCR_LIST_INT (0.2%)
  • _BINARY_OP_EXTEND (0.1%)
  • _CALL_ISINSTANCE (0.1%)
  • _CALL_METHOD_DESCRIPTOR_FAST (0.1%)
  • _CALL_METHOD_DESCRIPTOR_FAST_WITH_KEYWORDS (0.1%)
  • _CALL_METHOD_DESCRIPTOR_NOARGS (0.1%)
  • _CALL_TYPE_1 (0.1%)
  • _CHECK_ATTR_CLASS (0.1%)
  • _CONTAINS_OP_DICT (0.1%)
  • _GUARD_BINARY_OP_EXTEND (0.1%)
  • _GUARD_NOT_EXHAUSTED_TUPLE (0.1%)
  • _ITER_NEXT_TUPLE (0.1%)
  • _LIST_APPEND (0.1%)
  • _STORE_ATTR_SLOT (0.1%)
  • _STORE_SUBSCR_DICT (0.1%)
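As a quick sanity check on the "over half" figure, the shares listed above can be tallied. The sketch below groups them by value, in tenths of a percent to avoid floating-point drift. The listed percentages are rounded, so the total is only approximate.

```python
# (share in tenths of a percent, number of listed uops with that share)
SHARES = [
    (121, 1), (101, 1), (65, 1), (31, 1), (28, 1), (17, 1), (15, 1),
    (10, 2), (7, 2), (6, 5), (5, 4), (4, 3), (3, 5), (2, 11), (1, 14),
]

count = sum(n for _, n in SHARES)
total_tenths = sum(share * n for share, n in SHARES)

print(count, total_tenths / 10)  # 53 52.5
```

So the 53 uops in this list alone account for roughly 52.5% of executed uops, before counting the long tail below.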

And here are the 102 missing uops that are less than 0.1%. These are less important, but still may net us some wins on individual benchmarks:

  • _BINARY_OP_SUBSCR_CHECK_FUNC
  • _BINARY_OP_SUBSCR_TUPLE_INT
  • _BUILD_MAP
  • _BUILD_SET
  • _BUILD_SLICE
  • _BUILD_STRING
  • _CALL_BUILTIN_CLASS
  • _CALL_BUILTIN_FAST_WITH_KEYWORDS
  • _CALL_INTRINSIC_1
  • _CALL_INTRINSIC_2
  • _CALL_KW_NON_PY
  • _CALL_METHOD_DESCRIPTOR_O
  • _CALL_STR_1
  • _CALL_TUPLE_1
  • _CHECK_ATTR_METHOD_LAZY_DICT
  • _CHECK_EG_MATCH
  • _CHECK_EXC_MATCH
  • _CHECK_FUNCTION_VERSION_INLINE
  • _CHECK_FUNCTION_VERSION_KW
  • _CHECK_IS_NOT_PY_CALLABLE_KW
  • _CHECK_METHOD_VERSION
  • _CHECK_METHOD_VERSION_KW
  • _CHECK_PERIODIC_IF_NOT_YIELD_FROM
  • _CONVERT_VALUE
  • _COPY_FREE_VARS
  • _DELETE_ATTR
  • _DELETE_DEREF
  • _DELETE_FAST
  • _DELETE_GLOBAL
  • _DELETE_NAME
  • _DELETE_SUBSCR
  • _DEOPT
  • _DICT_MERGE
  • _DICT_UPDATE
  • _END_FOR
  • _END_SEND
  • _ERROR_POP_N
  • _EXIT_INIT_CHECK
  • _EXPAND_METHOD
  • _EXPAND_METHOD_KW
  • _FATAL_ERROR
  • _FORMAT_SIMPLE
  • _FORMAT_WITH_SPEC
  • _GET_AITER
  • _GET_ANEXT
  • _GET_AWAITABLE
  • _GET_LEN
  • _GET_YIELD_FROM_ITER
  • _GUARD_DORV_NO_DICT
  • _GUARD_GLOBALS_VERSION
  • _GUARD_TOS_FLOAT
  • _GUARD_TOS_INT
  • _GUARD_TYPE_VERSION_AND_LOCK
  • _IMPORT_FROM
  • _IMPORT_NAME
  • _IS_NONE
  • _LIST_EXTEND
  • _LOAD_ATTR_NONDESCRIPTOR_NO_DICT
  • _LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES
  • _LOAD_BUILD_CLASS
  • _LOAD_COMMON_CONSTANT
  • _LOAD_FAST_LOAD_FAST
  • _LOAD_FROM_DICT_OR_DEREF
  • _LOAD_GLOBAL
  • _LOAD_GLOBAL_BUILTINS
  • _LOAD_GLOBAL_MODULE
  • _LOAD_LOCALS
  • _LOAD_NAME
  • _LOAD_SUPER_ATTR_ATTR
  • _LOAD_SUPER_ATTR_METHOD
  • _MAKE_CALLARGS_A_TUPLE
  • _MAKE_CELL
  • _MAKE_FUNCTION
  • _MAP_ADD
  • _MATCH_CLASS
  • _MATCH_KEYS
  • _MATCH_MAPPING
  • _MATCH_SEQUENCE
  • _MAYBE_EXPAND_METHOD_KW
  • _NOP
  • _POP_EXCEPT
  • _POP_TWO_LOAD_CONST_INLINE_BORROW
  • _PUSH_EXC_INFO
  • _PUSH_NULL_CONDITIONAL
  • _SETUP_ANNOTATIONS
  • _SET_ADD
  • _SET_FUNCTION_ATTRIBUTE
  • _SET_UPDATE
  • _STORE_ATTR
  • _STORE_ATTR_INSTANCE_VALUE
  • _STORE_ATTR_WITH_HINT
  • _STORE_DEREF
  • _STORE_FAST_LOAD_FAST
  • _STORE_FAST_STORE_FAST
  • _STORE_GLOBAL
  • _STORE_NAME
  • _STORE_SLICE
  • _TIER2_RESUME_CHECK
  • _UNARY_INVERT
  • _UNARY_NEGATIVE
  • _UNPACK_SEQUENCE_LIST
  • _WITH_EXCEPT_START

Linked PRs

@brandtbucher
Member Author

brandtbucher commented Apr 1, 2025

@diegorusso is going to add _CALL_LEN.

@brandtbucher
Member Author

brandtbucher commented Apr 3, 2025

@fluhus is going to add _BINARY_SLICE.

@brandtbucher
Member Author

@Klaus117 is going to improve _TO_BOOL_INT.

@brandtbucher
Member Author

@Zheaoli is going to add _CONTAINS_OP_DICT.

@Zheaoli
Contributor

Zheaoli commented Apr 8, 2025

I think I can work on _BINARY_OP_SUBSCR_LIST_INT and _BINARY_OP_SUBSCR_DICT

@brandtbucher
Member Author

> I think I can work on _BINARY_OP_SUBSCR_LIST_INT and _BINARY_OP_SUBSCR_DICT

Sorry, I already have a branch to do the guards for these (and a couple others) that I was going to put up in a minute! I'll tag you for review though.

@brandtbucher
Member Author

@tomasr8 is going to add _CALL_STR_1, _CALL_TUPLE_1, and _CALL_TYPE_1.

@Zheaoli, want to take _BUILD_LIST, _BUILD_MAP, _BUILD_SET, _BUILD_SLICE, and _BUILD_STRING? For all but _BUILD_SLICE and _BUILD_STRING, I think we can only set the type of the output (sym_new_type). For _BUILD_SLICE and _BUILD_STRING, we may be able to have a constant output if the items are constant (sym_is_const/sym_get_const/sym_new_const). Maybe one PR for the first three, and separate PRs for _BUILD_SLICE and _BUILD_STRING?
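For anyone picking these up, here is a rough Python sketch of the distinction drawn above between "type only" outputs and "constant" outputs. The helper names mirror the ones mentioned (`sym_new_type`, `sym_is_const`, `sym_get_const`, `sym_new_const`), but these are toy stand-ins with made-up signatures: the real helpers are C functions in CPython's optimizer, and `Sym`, `build_list`, and `build_string` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

_NO_CONST = object()  # sentinel meaning "no constant value is known"


@dataclass(frozen=True)
class Sym:
    """Toy abstract value: an optional known type and an optional constant."""
    typ: Optional[type] = None
    const: object = _NO_CONST


# Toy stand-ins for the optimizer helpers (assumed, simplified signatures).
def sym_new_type(typ):
    return Sym(typ=typ)


def sym_new_const(value):
    return Sym(typ=type(value), const=value)


def sym_is_const(sym):
    return sym.const is not _NO_CONST


def sym_get_const(sym):
    return sym.const


def build_list(items):
    # A fresh list is mutable, so only its type can be recorded.
    return sym_new_type(list)


def build_string(pieces):
    # If every piece is a known constant string, the result is constant too.
    if all(sym_is_const(p) and isinstance(sym_get_const(p), str) for p in pieces):
        return sym_new_const("".join(sym_get_const(p) for p in pieces))
    return sym_new_type(str)


print(sym_get_const(build_string([sym_new_const("a"), sym_new_const("b")])))  # ab
```

When any piece is not a known constant, `build_string` falls back to recording just the `str` type, which is the same treatment `build_list` always gets.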

@brandtbucher
Member Author

Yep, your understanding is correct!

Zheaoli added a commit to Zheaoli/cpython that referenced this issue Apr 12, 2025
… _BUILD_LIST, _BUILD_SET, _BUILD_MAP

Signed-off-by: Manjusaka <[email protected]>
Zheaoli added a commit to Zheaoli/cpython that referenced this issue Apr 15, 2025
seehwan pushed a commit to seehwan/cpython that referenced this issue Apr 16, 2025
Zheaoli added a commit to Zheaoli/cpython that referenced this issue Apr 16, 2025
… _BUILD_LIST, _BUILD_SET, _BUILD_MAP

Signed-off-by: Manjusaka <[email protected]>
Fidget-Spinner pushed a commit that referenced this issue Apr 16, 2025
…LD_LIST`, `_BUILD_SLICE`, and `_BUILD_MAP` (GH-132434)

---------

Signed-off-by: Manjusaka <[email protected]>
diegorusso added a commit to diegorusso/cpython that referenced this issue Apr 25, 2025
Reduce unnecessary guards whenever `len()` is called and used
in arithmetic operations.
Fidget-Spinner pushed a commit that referenced this issue Apr 25, 2025
Reduce unnecessary guards whenever `len()` is called and used
after.

Co-authored-by: Max Bernstein <[email protected]>
Fidget-Spinner pushed a commit that referenced this issue Apr 27, 2025
… _BUILD_STRING, _BUILD_SET (GH-132564)

Signed-off-by: Manjusaka <[email protected]>
6 participants