Using llm-sandbox Inside Docker and Container Strategy for Concurrent LLM API Calls #85
Hi, I'm a bit confused about how llm-sandbox is intended to work in a Dockerized setup. My application already runs inside a Docker container. Is it possible to use llm-sandbox from within that container? What exactly would happen in that case? If it's not compatible out of the box, would it be possible to make it work by using an already running container?

Secondly, I plan to build an API that will handle concurrent LLM calls. In that context, would it be advisable to run all llm-sandbox executions within a single shared container? I understand this might introduce a minor security risk, but it's one I'm willing to accept for now.

Thanks in advance for your guidance!
Replies: 1 comment
Hi @IoannisMaras! You're right that Docker doesn't natively support running Docker inside Docker, but llm-sandbox has you covered! The key is its remote Docker client support: instead of trying to run Docker containers from within your existing container, you can configure llm-sandbox to connect to a remote Docker daemon.

```python
import docker
from llm_sandbox import SandboxSession

# Configure a TLS connection to the remote Docker host
tls_config = docker.tls.TLSConfig(
    client_cert=("path/to/cert.pem", "path/to/key.pem"),
    ca_cert="path/to/ca.pem",
    verify=True,
)

docker_client = docker.DockerClient(base_url="tcp://<your_host>:<port>", tls=tls_config)

with SandboxSession(
    client=docker_client,
    image="python:3.9.19-bullseye",
    keep_template=True,
    lang="python",
) as session:
    result = session.run("print('Hello, World!')")
    print(result)
```
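If you'd rather not expose a TCP endpoint, a common alternative is to mount the host's Docker socket into your application container and point docker-py at it, so sandboxes run as sibling containers on the host daemon. A minimal sketch, assuming your container was started with `-v /var/run/docker.sock:/var/run/docker.sock` (the usual caveat applies: this gives the container full control of the host daemon):

```python
import docker
from llm_sandbox import SandboxSession

# Assumes the host's Docker socket is mounted into this container:
#   docker run -v /var/run/docker.sock:/var/run/docker.sock ...
docker_client = docker.DockerClient(base_url="unix://var/run/docker.sock")

with SandboxSession(client=docker_client, lang="python") as session:
    print(session.run("print('Hello from a sibling container!')"))
```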
For handling concurrent LLM API calls, I'd strongly recommend using Kubernetes instead of sharing a single container. Here's why: each request gets a fresh, isolated pod, and a Deployment keeps the pool pre-warmed, so you get per-request isolation without paying cold-start latency on every call.

llm-sandbox actually supports connecting to existing containers/pods (as mentioned in issue #51), which enables you to implement a container pool pattern:

```python
from kubernetes import client, config
from llm_sandbox import SandboxSession, SandboxBackend


class KubernetesContainerPool:
    def __init__(self, namespace="default", pool_size=5):
        config.load_kube_config()  # or config.load_incluster_config() inside a cluster
        self.k8s_client = client.CoreV1Api()
        self.namespace = namespace
        self.pool_size = pool_size
        self.deployment_name = "llm-sandbox-pool"

    def setup_pool_deployment(self):
        """Create a Deployment that maintains pre-warmed pods."""
        deployment_manifest = {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": self.deployment_name, "namespace": self.namespace},
            "spec": {
                "replicas": self.pool_size,
                "selector": {"matchLabels": {"app": "llm-sandbox-pool"}},
                "template": {
                    "metadata": {"labels": {"app": "llm-sandbox-pool"}},
                    "spec": {
                        "containers": [{
                            "name": "sandbox",
                            "image": "ghcr.io/vndee/sandbox-python-311-bullseye",
                            "command": ["tail", "-f", "/dev/null"],  # Keep the pod running
                            "resources": {
                                "requests": {"cpu": "100m", "memory": "256Mi"},
                                "limits": {"cpu": "1", "memory": "1Gi"},
                            },
                        }]
                    },
                },
            },
        }
        apps_v1 = client.AppsV1Api()
        apps_v1.create_namespaced_deployment(
            namespace=self.namespace,
            body=deployment_manifest,
        )

    def execute_code_concurrent(self, code, libraries=None):
        """Execute code in a fresh pod from the pool."""
        # Find a running pod in the pool
        pods = self.k8s_client.list_namespaced_pod(
            namespace=self.namespace,
            label_selector="app=llm-sandbox-pool",
        )
        pod_name = None
        for pod in pods.items:
            if pod.status.phase == "Running":
                pod_name = pod.metadata.name
                break
        if not pod_name:
            raise RuntimeError("No available pods in pool")
        try:
            # Connect to the existing pod using the container_id parameter
            with SandboxSession(
                backend=SandboxBackend.KUBERNETES,
                container_id=pod_name,  # This is the key feature!
                lang="python",
                client=self.k8s_client,
            ) as session:
                result = session.run(code, libraries=libraries)
                return result
        finally:
            # Delete the used pod; the Deployment will create a new one
            self.k8s_client.delete_namespaced_pod(
                name=pod_name,
                namespace=self.namespace,
            )


# Usage for concurrent API calls
pool = KubernetesContainerPool(pool_size=10)
pool.setup_pool_deployment()

# Each API call gets a fresh, pre-warmed pod
result = pool.execute_code_concurrent("""
import pandas as pd
print("Fast execution with fresh isolation!")
""", libraries=["pandas"])
```

I am thinking about adding this connection pool feature for k8s directly into llm-sandbox.