Skip to content

convert module initialization to use multi-phase-init #5142

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

davidhewitt
Copy link
Member

Part of #2274, inspired by all of #2245, #4081, and #4162

This converts the module initialization itself to use module slots and multi-phase init APIs. Module state is deliberately not covered here and left for follow ups. I also don't change anything related to subinterpreter checking, that is also left for a follow up.

@davidhewitt
Copy link
Member Author

Note to self: this might need to be opt-out (for a few versions) and be mentioned in the migration guide.

@mejrs
Copy link
Member

mejrs commented May 30, 2025

I like this very much!

Is this compatible with how we are currently putting pyclass' type objects in statics? We should figure out a way to store them in module state if possible.

@davidhewitt
Copy link
Member Author

I think it's sorta fine to leave the type objects in statics for now given we don't support subinterpreters, but yes:

  • we should seek to use module state if we can anyway as a followup
  • and we'd need to do so for any potential eventual subinterpreter support

@alex
Copy link
Contributor

alex commented May 30, 2025

I suppose eventually we'll want to allow users to have their own module state as well for their statics :-/

@mejrs
Copy link
Member

mejrs commented May 30, 2025

I think it's sorta fine to leave the type objects in statics for now given we don't support subinterpreters, but yes:

* we should seek to use module state if we can anyway as a followup

* and we'd need to do so for any potential eventual subinterpreter support

This will also put the nail in the coffin for #1444. I'm fine with that personally, but it'll be disappointing for quite a lot of people. Maybe we should start thinking about other ways to address that use case.

@davidhewitt
Copy link
Member Author

I agree, I think there should be ways to take inspiration from how pybind11 makes that possible, with the caveat that Rust's lack of stable ABI makes things more tricky.

RE this PR - it looks like PyPy does not support all the APIs for multi phase init of the submodules, I will see if there's a way I can make that work with some conditional code. (🤮)

@alex
Copy link
Contributor

alex commented Jun 11, 2025

We should also ping the PyPy folks (h/t @mattip) with the APIs that are missing so that we can eventually remove the special cases.

@mattip
Copy link
Contributor

mattip commented Jun 11, 2025

PyPy does not support all the APIs for multi phase init of the submodules

I thought PyPy does support multi phase init, which APIs did I miss?

EDIT: I see a missing PyModule_ExecDef here but that should be PyPyModule_ExecDef. Maybe there needs to be a #[cfg_attr(PyPy, link_name = "PyPyModule_ExecDef")] decorator?

@davidhewitt
Copy link
Member Author

Yep PyModule_ExecDef our definition is wrong, but we also need PyModule_FromDefAndSpec2 which I couldn't see in the pypy exports.

@mattip
Copy link
Contributor

mattip commented Jun 13, 2025

I implemented the missing PyModule_FromDefAndSpec2 and PyModule_FromDefAndSpec on latest pypy-3.11 HEAD. With that, I will point out that cython does not use these low-level interfaces, rather it uses PyModuleDef_Init. Here is a small example in C that demonstrates a multiphase module with PyModuleDef_Init from the PyPy tests

C code using multiphase initialization
    #include <Python.h>

    
        static PyModuleDef multiphase_def;

        static PyObject* multiphase_create(PyObject *spec, PyModuleDef *def) {
            PyObject *module = PyModule_New("altname");
            PyObject_SetAttrString(module, "create_spec", spec);
            PyObject_SetAttrString(module, "create_def_eq",
                                   PyBool_FromLong(def == &multiphase_def));
            return module;
        }

        static int multiphase_exec(PyObject* module) {
            Py_INCREF(Py_True);
            PyObject_SetAttrString(module, "exec_called", Py_True);
            return 0;
        }

        static PyModuleDef_Slot multiphase_slots[] = {
            {Py_mod_create, multiphase_create},
            {Py_mod_exec, multiphase_exec},
            {0, NULL}
        };

        static PyModuleDef multiphase_def = {
            PyModuleDef_HEAD_INIT,                      /* m_base */
            "multiphase",                               /* m_name */
            "example docstring",                        /* m_doc */
            0,                                          /* m_size */
            NULL,                                       /* m_methods */
            multiphase_slots,                           /* m_slots */
            NULL,                                       /* m_traverse */
            NULL,                                       /* m_clear */
            NULL,                                       /* m_free */
        };
        

    PyMODINIT_FUNC
    PyInit_multiphase(void) {
    
        return PyModuleDef_Init(&multiphase_def);
        
    }
testing code
import multiphase
assert multiphase.create_spec
assert multiphase.create_spec is multiphase.__spec__
assert multiphase.create_def_eq
assert multiphase.exec_called

@davidhewitt
Copy link
Member Author

Thanks, that's perfect, I will seek to retest with pypy nightly.

I will point out that cython does not use these low-level interfaces, rather it uses PyModuleDef_Init.

Indeed, we primarily use that here too, but there are other code paths which directly create module objects from the definitions now written as multi-phase (e.g. submodule objects added to a root module during the exec phase)

@mattip
Copy link
Contributor

mattip commented Jun 13, 2025

Cool.

I will seek to retest with pypy nightly.

Sorry, I was a bit too late for last night's build, it will only appear tomorrow.

@davidhewitt davidhewitt added the CI-no-fail-fast If one job fails, allow the rest to keep testing label Jun 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI-build-full CI-no-fail-fast If one job fails, allow the rest to keep testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants