
Conversation

mlabeeb03
Contributor

@mlabeeb03 mlabeeb03 commented Aug 6, 2025

closes #1208
This PR is an enhancement that allows installing python packages at runtime, without the need to rebuild images, as explained in #1208.

This requires MinIO to work. We are not using mounted volumes because those would not work with k8s.

This change introduces:

  1. A do command that installs the python packages in a folder through the job runner and then uploads the zip file to Django storage.
  2. A uwsgi daemon inside the LMS and CMS containers that detects when the file has been uploaded to Django storage and triggers a reload.
  3. A script that runs on each uwsgi reload and downloads the latest dependencies.
  4. An addition to the PYTHONPATH variable so the newly downloaded packages can be discovered.

In order to use this, you should build the openedx docker image and add package names to the LIVE_DEPENDENCIES config variable:
tutor config save --append LIVE_DEPENDENCIES="ai-coach-xblock"
And then run the do command:
tutor local do build-live-dependencies
Your lms and cms containers will automatically detect the change after 10 seconds, and you will be able to access these packages/xblocks in Studio and the LMS.

This change introduces:
1) A persistent volume where Xblocks can be installed at runtime without the need to rebuild images.
2) A do command that actually installs the Xblock and its dependencies inside said volume.
3) An addition to the PYTHONPATH variable so the newly installed Xblocks can be discovered.
@mlabeeb03 mlabeeb03 marked this pull request as draft August 6, 2025 11:14
@github-project-automation github-project-automation bot moved this to Pending Triage in Tutor project management Aug 6, 2025
@mlabeeb03 mlabeeb03 changed the title feat: add xblocks at runtime without rebuilding image [WIP] feat: add xblocks at runtime without rebuilding image Aug 6, 2025
@regisb
Contributor

regisb commented Aug 6, 2025

This is interesting, as a first step, but we need to address a few issues before we actually start supporting this. Let me ask a few questions then offer my own answers:

  1. Q: How to make this work for Kubernetes? A: By storing the install dir in Django storage, as a zip file.
  2. Q: After an xblock was installed, how do we reload the uwsgi server to apply the changes? A: With touch-workers-reload.
  3. Q: How do we list currently installed extensions? A: By storing the list of installed xblocks/packages in a Tutor configuration setting.
  4. Q: How do we uninstall a package? A: By removing the corresponding entry from that list, then rebuilding the dependency zip file from scratch.
  5. Q: Can we do the same thing with edx-platform Django plugins? A: I'm not sure. We would need to automatically apply migrations and collect static assets. Or serve static assets directly. I don't know how to do that.
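The touch-workers-reload idea in question 2 maps onto uwsgi's built-in touch-reload option, which restarts workers whenever a watched file's mtime changes. A hedged sketch of such a config fragment, assuming the trigger file lives at the path used elsewhere in this PR (this is not necessarily the actual Tutor config):

```ini
[uwsgi]
; Reload all workers whenever this file is modified.
; The path is an assumption for illustration.
touch-reload = /mnt/third-party-xblock/.uwsgi_trigger
```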

I think my answers to questions 1-4 are sufficient. I'm stuck at 5, though.

@mlabeeb03
Contributor Author

Would we be supporting Kubernetes and Django apps from the start, or send patches later on?
Should I ditch volume mounts and move to Django storage, since we need to support Kubernetes eventually?

I can see that we should probably have a separate command group for xblocks which supports:
tutor xblocks install
tutor xblocks list
tutor xblocks uninstall
tutor xblocks reset/rebuild

Similarly for Django apps:
tutor djapps install
tutor djapps list
tutor djapps uninstall
tutor djapps reset/rebuild

A txbi (tutor xblock index), similar to what we have with tpi (tutor plugins index), so that xblocks integrate nicely with Tutor Deck.

@mlabeeb03
Contributor Author

@regisb I added the uninstall and list functions. I do need help understanding why we can't use volumes for k8s as well. What problem does a zip file in django storage solve?

@regisb
Contributor

regisb commented Aug 8, 2025

Would we be supporting Kubernetes and Django apps from the start or send patches later on?

We can support just docker compose for now, but we need an approach that will work with Kubernetes in the future.

I do need help understanding why we can't use volumes for k8s as well.

In general, it's very difficult to create volumes with write access that are accessible by multiple nodes in Kubernetes. Think: many different LMS/CMS containers which all need to access the same volume. This is why media storage is usually moved to S3 or MinIO in k8s.

@regisb
Contributor

regisb commented Aug 8, 2025

FYI this is what I have for installation of dependencies in a pip prefix that is then stored as a zip file in django storage:

import os
import shlex
import shutil
import subprocess
import sys
import tempfile

HERE = os.path.abspath(os.path.dirname(__file__))
DEPS_ZIP_PATH = os.path.join(HERE, "deps.zip")
DEPS_PATH = os.path.join(HERE, "deps/")
DEPS = ["django", "appdirs", "ai-coach-xblock"]

def main():
    for command in sys.argv[1:]:
        if command == "build":
            build()
        elif command == "append":
            append()
        elif command == "install":
            install()
        elif command == "load":
            load()
        elif command == "test":
            test()
        else:
            raise ValueError(f"Unknown command: {command}")


def build():
    with tempfile.TemporaryDirectory(prefix="tutor-deps-") as build_dir:
        _pip_install(DEPS, build_dir)

        # If zip file exists, delete it
        # TODO move this elsewhere?
        if os.path.exists(DEPS_ZIP_PATH):
            os.remove(DEPS_ZIP_PATH)

        _make_zip_archive(build_dir)

def append():
    new_deps = ["tutor"]
    with tempfile.TemporaryDirectory(prefix="tutor-deps-") as build_dir:
        # Unpack existing archive
        shutil.unpack_archive(DEPS_ZIP_PATH, extract_dir=build_dir)
        _pip_install(new_deps, build_dir)
        _make_zip_archive(build_dir)

def install():
    # TODO move this elsewhere?
    if not os.path.exists(DEPS_ZIP_PATH):
        raise RuntimeError(f"{DEPS_ZIP_PATH} does not exist")

    # TODO move this elsewhere?
    if os.path.exists(DEPS_PATH):
        shutil.rmtree(DEPS_PATH)

    # Unpack zip archive
    shutil.unpack_archive(DEPS_ZIP_PATH, extract_dir=DEPS_PATH)

def load():
    # Load dependencies into python path
    sys.path.append(
        os.path.join(
            DEPS_PATH,
            "lib",
            f"python{sys.version_info.major}.{sys.version_info.minor}",
            "site-packages",
        )
    )

def test():
    import tutor
    import importlib_metadata
    xblocks = list(importlib_metadata.entry_points(group="xblock.v1"))
    if not xblocks:
        raise ValueError("no xblocks found!")
    print(xblocks)


def _pip_install(deps, prefix_dir):
    for dep in deps:
        check_call("pip", "install", f"--prefix={prefix_dir}", dep)

def _make_zip_archive(src_dir):
    """
    Create the archive in a tmp dir and then move it into place. That way, the
    worker reload is only triggered once the zip file is complete.
    """
    with tempfile.TemporaryDirectory(prefix="tutor-depszip-") as zip_dir:
        # Create zip file
        path = os.path.join(zip_dir, "deps.zip")
        shutil.make_archive(path[:-4], format="zip", root_dir=src_dir)

        # Move
        os.rename(path, DEPS_ZIP_PATH)

def check_call(*args):
    print(shlex.join(args))
    subprocess.check_call(args)


if __name__ == "__main__":
    main()
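The build/install cycle in this script reduces to a make_archive/unpack_archive round-trip. This stdlib-only sketch (all paths are temporary and illustrative; no pip involved) exercises that mechanism in isolation:

```python
import os
import shutil
import tempfile

# "build": pretend a pip prefix was populated, then archive it as deps.zip.
src = tempfile.mkdtemp(prefix="tutor-deps-src-")
os.makedirs(os.path.join(src, "lib"))
with open(os.path.join(src, "lib", "marker.txt"), "w") as f:
    f.write("installed")

# make_archive appends ".zip" to the base name and returns the full path.
zip_path = shutil.make_archive(
    os.path.join(tempfile.mkdtemp(prefix="tutor-depszip-"), "deps"),
    format="zip",
    root_dir=src,
)

# "install": unpack the archive into a fresh deps/ directory.
dest = tempfile.mkdtemp(prefix="tutor-deps-dest-")
shutil.unpack_archive(zip_path, extract_dir=dest)

print(open(os.path.join(dest, "lib", "marker.txt")).read())  # → installed
```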

@DawoudSheraz DawoudSheraz moved this from Pending Triage to In Progress in Tutor project management Aug 11, 2025
@regisb
Contributor

regisb commented Aug 14, 2025

I am facing the following issues with the current PR as it stands:

On the first run, there is no .uwsgi_trigger file and uwsgi is failing with the following error: lms-1 | unable to stat() /mnt/third-party-xblock/.uwsgi_trigger, events will be triggered as soon as the file is created

I can attempt to create the file by running: tutor local run lms touch /mnt/third-party-xblock/.uwsgi_trigger. But this is failing with the following error:

touch: cannot touch '/mnt/third-party-xblock/.uwsgi_trigger': Permission denied

I fixed this by running: sudo chown 1000:1000 ~/.local/share/tutor/data/third-party-xblock/

Then, my uwsgi processes are not restarting when I simply "touch" the file. I need to write content to it to trigger a reload. I had to update the PR as follows:

script = f"pip install --prefix=/mnt/third-party-xblock {package} && echo \"$(date)\" > /mnt/third-party-xblock/.uwsgi_trigger"

Still, I managed to get this to work. So let's keep it as it is for now and revisit after the demo.

@kdmccormick
Collaborator

Cool!!

@regisb
Contributor

regisb commented Aug 14, 2025

Note that live dependencies have some limitations for now: we are not handling migrations, translations, or static assets.

@mlabeeb03
Contributor Author

touch: cannot touch '/mnt/third-party-xblock/.uwsgi_trigger': Permission denied

@regisb Can you see the mounted volume in the job runner container? I had the same problem but it went away when I updated the volumes for job runners.

unable to stat()

This should go away the first time you touch the file; still, I can look into it so it's already present on the first run.

@mlabeeb03
Contributor Author

Update: I've added a few new tutor commands: tutor packages list, tutor packages append, tutor packages remove, tutor packages build. The goal was to not give write access to containers and instead use the host machine to install these packages. The only problem with this approach is that dependencies are built for the host machine and will not work in the container once they overwrite those already present in the container (for example, lxml). This means we have to use the --no-deps flag and install each missing dependency separately, so that the already installed packages are not affected.
The do command works fine, though, but I am unsure if we can use that with kubernetes, i.e. will tutor k8s do pip-install xblock install the xblock in the mounted directory and restart the uWSGI server in each pod?

@mlabeeb03 mlabeeb03 force-pushed the labeeb/install-xblocks-at-runtime branch from 61ea255 to 6ad39cd on August 26, 2025 12:38
@mlabeeb03 mlabeeb03 force-pushed the labeeb/install-xblocks-at-runtime branch from 6ad39cd to 08ea6ad on August 26, 2025 12:41
@mlabeeb03 mlabeeb03 requested a review from DawoudSheraz August 26, 2025 12:46
Contributor

@DawoudSheraz DawoudSheraz left a comment


  • PR is still in draft
  • The readme and/or documentation updates are missing. This is a new feature, and we need to add proper documentation for the community to follow.
  • Note whether anything is intended for a followup PR (any k8s-related pending action).

@mlabeeb03
Contributor Author

mlabeeb03 commented Aug 28, 2025

This version will work for both docker-compose and k8s (it still requires additional k8s configs) but is now dependent on the minio plugin. Having core depend on a plugin does not seem like the right way. Perhaps this should be a separate xblocks installer plugin that depends on minio.
Note that if we were to support only docker-compose, we could do this in the core itself without using minio; however, supporting k8s means we need remote object storage for this to work on a multi-node cluster.

@mlabeeb03 mlabeeb03 force-pushed the labeeb/install-xblocks-at-runtime branch from 7ffa000 to 44f50d4 on August 28, 2025 11:37
@DawoudSheraz
Contributor

This version will work for both docker-compose and k8s (it still requires additional k8s configs) but is now dependent on the minio plugin. Having core depend on a plugin does not seem like the right way. Perhaps this should be a separate xblocks installer plugin that depends on minio. Note that if we were to support only docker-compose, we could do this in the core itself without using minio; however, supporting k8s means we need remote object storage for this to work on a multi-node cluster.

Let's continue this discussion internally. Having the core depend upon a plugin is not desirable. We can have a dedicated plugin to achieve this if we cannot find a way without minio.

Contributor

@regisb regisb left a comment


There is a misunderstanding about how MinIO should be used. It is unnecessary to depend on MinIO, boto3 or s3. Tutor does not need to know about the storage backend. Instead, Tutor should use Django Storages for loading/saving the live dependencies.

It is then the role of the MinIO plugin (or whichever plugin we use for storage, such as s3) to configure Django Storages to use MinIO (or s3), as is currently done: https://github.com/overhangio/tutor-minio/blob/04c5f1149b51e12a33f10caab9428425a69aed7a/tutorminio/patches/openedx-common-settings#L1

@mlabeeb03 mlabeeb03 force-pushed the labeeb/install-xblocks-at-runtime branch from 36345c2 to 73272a7 on September 4, 2025 09:35
@mlabeeb03 mlabeeb03 force-pushed the labeeb/install-xblocks-at-runtime branch from 73272a7 to dd3eb35 on September 4, 2025 11:10
@Squirrel18

Hello, everyone. I am new to this conversation and to Tutor development, so if I am missing something or this is not the right place, please let me know.

I am currently working on a Tutor plugin to set up and configure development environments on top of Tutor's current development capabilities. At the moment, for development environments, we can work with a local repository of edx-platform using Tutor mounts, but, when working with xblocks or djangoapps, things change, and after a quick search, there are different ways to approach this.

In my opinion, having live dependencies (or a way to install local Python packages with pip inside of a running container) only makes sense when working in development environments, just like with the openedx dev image, which has vim or ipdb only for dev environments.

For non-dev environments (local or production) using K8s, having live dependencies is, in my opinion, like changing a stateless deployment into an almost StatefulSet one, since the openedx deployment would now depend on an underlying volume to work, or at least to work in the desired manner.
For containerised applications, the image must contain everything necessary for the application to work. Therefore, for local environments, the image must always be built in such a way that it reflects the new Python packages required for a specific installation, rather than relying on certain Python packages being injected once the container is up and running.

I believe that if we limited this to dev environments, we would only need to rely on the state of the Docker container to share the volumes where the xblocks and djangoapps are located, and then run pip install inside the container as needed, since in other environments (local) it should not be possible to do so unless a new image is created with all the packages necessary for the application to work as expected. Looking forward to continuing this conversation, thanks!

@mlabeeb03
Contributor Author

@Squirrel18 This is an anti-pattern; I am not going to deny that. Here is our rationale behind doing this:

With the development of Tutor Deck, we will have to develop many plugins, and users will install many more of them. How can we install extensions in edx-platform without “stopping the world” (i.e., running “tutor local launch” and rebuilding the “openedx” Docker image)? This is a very slow process, and users need to enable plugins faster.

We will allow this on all environments (dev/local/k8s) and it is going to be up to the user to decide where they want to use it. The classic way of installing dependencies will always be there though.

@Squirrel18

Hi, @mlabeeb03 Thank you very much for the explanation and sharing the rationale behind.

Just out of curiosity, I have two questions:

  1. If this is necessary based (mainly) on Tutor Deck requirements, should this functionality be in the Tutor Deck plugin rather than be part of the Tutor core? Even if, as operators, we will have the option to use it or not?
  2. As I see it, we depend on the host machine to compile the requirements. Could something like https://devpi.net/docs/devpi/devpi/stable/%2Bd/index.html handle the management and storage of Python packages? This live dependency (runtime package injection) functionality would then only be responsible for consuming packages (public or private) from the local devpi and pip installing/uninstalling these requirements, instead of managing packing and unpacking, storage, volume handling, etc.

Thanks again!

@mlabeeb03
Contributor Author

@Squirrel18 Yes, this will be moved to a separate plugin. This PR was opened in tutor core for higher visibility.
Regarding your second point, I am not sure if using devpi simplifies our use case. Here is what happens now.

Uploading: the job runner container downloads packages from PyPI -> zips them -> uploads the zip file to Django storage (minio in this case).
Downloading: a daemon inside each container detects that Django storage has been updated -> restarts the uwsgi server -> the uwsgi server downloads/extracts the packages on each start/restart.

Where would devpi come into this?
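The "daemon detects an update" step above is essentially a last-modified poll. A minimal stdlib sketch of that loop (names are illustrative; the real daemon would compare Django storage's modified time and trigger a uwsgi reload rather than return a flag):

```python
import os
import tempfile
import time

def wait_for_change(path, last_mtime, polls=10, interval=0.01):
    """Poll the file's mtime; return True as soon as it differs from last_mtime."""
    for _ in range(polls):
        if os.path.getmtime(path) != last_mtime:
            return True
        time.sleep(interval)
    return False

# Demo: record the trigger file's mtime, simulate the job runner uploading
# a new archive, and observe that the watcher notices.
trigger = os.path.join(tempfile.mkdtemp(), "deps.zip")
with open(trigger, "w") as f:
    f.write("v1")
last = os.path.getmtime(trigger)

now = time.time()
os.utime(trigger, (now + 10, now + 10))  # simulate a fresh upload

print(wait_for_change(trigger, last))  # → True
```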

@Squirrel18

Squirrel18 commented Sep 15, 2025

Hi, @mlabeeb03 You are correct: according to the PR description, the approach is to use PYTHONPATH to add live dependencies and, I think, to avoid using pip to install or uninstall Python packages inside openedx containers, so devpi will not simplify the approach. I was thinking of using devpi to centralise package management and storage (for public and private repositories), but since the goal is to avoid using pip, it doesn't make sense to use it.

Thank you for taking the time to answer my questions and to provide more context about this.

Linked issue: Add ability to install third-party xblocks at runtime without needing to rebuild docker image (docker-compose only)