diff --git a/docs/hub/overview.md b/docs/hub/overview.md
new file mode 100644
index 00000000..5e6c7b34
--- /dev/null
+++ b/docs/hub/overview.md
@@ -0,0 +1,99 @@
+---
+title: "Overview"
+description: "The RunPod Hub is a centralized repository for discovering, sharing, and deploying preconfigured AI repos optimized for RunPod's Serverless infrastructure."
+sidebar_position: 1
+---
+
+# RunPod Hub overview
+
+:::info
+
+The RunPod Hub is currently in **beta**. We're actively developing features and gathering user feedback to improve the experience. Please [join our Discord](https://discord.gg/runpod) if you'd like to provide feedback.
+
+:::
+
+The RunPod Hub is a centralized repository that enables users to **discover, share, and deploy preconfigured AI repos** optimized for RunPod's [Serverless infrastructure](/serverless/overview). It serves as a catalog of vetted, open-source repositories that can be deployed with minimal setup, creating a collaborative ecosystem for AI developers and users.
+
+Whether you're a developer looking to **share your work** or a user seeking **preconfigured solutions**, the Hub makes discovering and deploying AI projects seamless and efficient.
+
+![Screenshot of the Hub page in the RunPod console](/img/docs/hub-homepage.png)
+
+## Why use the Hub?
+
+The Hub simplifies the entire lifecycle of repo sharing and deployment, from initial submission through testing, discovery, and usage.
+
+### For RunPod users
+
+- **Find production-ready AI solutions**: Discover vetted, open-source repositories optimized for RunPod with minimal setup required.
+- **Deploy in one click**: Go from discovery to running services in minutes, not days.
+- **Customize to your needs**: RunPod Hub repos expose configurable parameters for fine-tuning without diving into code.
+- **Save development time**: Leverage community innovations instead of building from scratch.
+
+### For Hub creators
+
+- **Showcase your work**: Share your projects with the broader AI community.
+- **Maintain control**: Your GitHub repo remains the source of truth, while the Hub automatically detects new releases.
+- **Streamline your workflow**: Automated building and testing ensures your releases work as expected.
+
+## How it works
+
+The Hub operates through several key components working together:
+
+1. **Repository integration**: The Hub connects with GitHub repositories, using GitHub releases (not commits) as the basis for versioning and updates.
+2. **Configuration system**: Repositories use standardized configuration files (`hub.json` and `tests.json`) in a `.runpod` directory to define metadata, hardware requirements, and test procedures. See the [publishing guide](/hub/publishing-guide) to learn more.
+3. **Automated build pipeline**: When a repository is submitted or updated, the Hub automatically scans, builds, and tests it to ensure it works correctly on RunPod's infrastructure.
+4. **Continuous release monitoring**: The system regularly checks for new releases in registered repositories and rebuilds them when updates are detected.
+5. **Deployment interface**: Users can browse repos, customize parameters, and deploy them to RunPod infrastructure with minimal configuration.
+
+## Getting started
+
+Whether you're a veteran developer who wants to share your work or a newcomer exploring AI models for the first time, the RunPod Hub makes getting started quick and straightforward.
+
+### Deploy a repo from the Hub
+
+You can deploy a repo from the Hub in seconds:
+
+1. Navigate to the [Hub page](https://www.runpod.io/console/hub) in the RunPod console.
+2. Browse the collection and select a repo that matches your needs.
+3. Review the repo details, including hardware requirements and available configuration options, to ensure compatibility with your use case.
+4. Click the **Deploy** button in the top-right of the repo page. You can also use the dropdown menu to deploy an older version.
+5. Click **Create Endpoint**.
+
+Within minutes you'll have access to a new Serverless endpoint, ready for integration with your applications or experimentation.
+
+### Publish your own repo
+
+Sharing your work through the Hub starts with preparing your GitHub repository with a working [Serverless endpoint](/serverless/overview) implementation, consisting of a [handler function](/serverless/handlers/overview) and a `Dockerfile`. To learn how to create your first endpoint, [follow this guide](/serverless/get-started).
+
+Once your endpoint is ready to share:
+
+1. Add the required configuration files in a `.runpod` directory, following the instructions in the [Hub publishing guide](/hub/publishing-guide).
+2. Create a GitHub release to establish a versioned snapshot.
+3. Submit your repository to the Hub through the RunPod console, where it will undergo automated building and testing.
+4. Wait for the RunPod team to review your repo. After approval, your repo will appear in the Hub.
+
+To learn more, see the [Hub publishing guide](/hub/publishing-guide).
+
+## Use cases
+
+The RunPod Hub supports a wide range of AI applications and workflows. Here are some common use cases that demonstrate the versatility of Hub repositories:
+
+### For AI researchers and enthusiasts
+
+Researchers can quickly deploy state-of-the-art models for experimentation without managing complex infrastructure. The Hub provides access to optimized implementations of popular models like Stable Diffusion, LLMs, and computer vision systems, allowing for rapid prototyping and iteration. This accessibility democratizes AI research by reducing the technical barriers to working with cutting-edge models.
+
+### For individual developers
+
+Individual developers benefit from the ability to experiment with different AI models and approaches without extensive setup time. The Hub provides an opportunity to learn from well-structured projects. Repos are designed to optimize resource usage, helping developers minimize costs while maximizing performance.
+
+### For enterprises and teams
+
+Enterprises and teams can accelerate their development cycle by using preconfigured repos instead of creating everything from scratch. The Hub reduces infrastructure complexity by providing standardized deployment configurations, allowing technical teams to focus on their core business logic rather than spending time configuring infrastructure and dependencies.
+
+## Join the community
+
+The RunPod Hub is more than just a list of repos: it's a community of AI builders sharing knowledge and innovation.
+
+By participating, you'll connect with other developers facing similar challenges and discover cutting-edge implementations that solve problems you might be struggling with.
+
+Whether you're deploying your first model or sharing your twentieth repo, the Hub provides both the infrastructure and community connections to help you succeed.
diff --git a/docs/hub/publishing-guide.md b/docs/hub/publishing-guide.md
new file mode 100644
index 00000000..433f1e74
--- /dev/null
+++ b/docs/hub/publishing-guide.md
@@ -0,0 +1,354 @@
+---
+title: "Publishing guide"
+description: "Learn how to configure your repository for the RunPod Hub with hub.json and tests.json files, including metadata, deployment settings, and test specifications."
+sidebar_position: 3
+---
+
+# Hub publishing guide
+
+:::info
+
+The RunPod Hub is currently in **beta**. We're actively developing features and gathering user feedback to improve the experience. Please [join our Discord](https://discord.gg/runpod) if you'd like to provide feedback.
+
+:::
+
+Learn how to publish your repositories to the [RunPod Hub](https://www.runpod.io/console/hub), including how to configure your repository with the required `hub.json` and `tests.json` files.
+
+![Screenshot of the RunPod Hub repo setup UI](/img/docs/hub-publish-page.png)
+
+## How to publish your repo
+
+Follow these steps to add your repository to the Hub:
+
+1. Navigate to the [Hub page](https://www.runpod.io/console/hub) in the RunPod console.
+2. Under **Add your repo**, click **Get Started**.
+3. Enter your GitHub repo URL.
+4. Follow the UI steps to add your repo to the Hub.
+
+The Hub UI will walk you through how to:
+
+1. Create your `hub.json` and `tests.json` files.
+2. Ensure your repository contains a `handler.py`, `Dockerfile`, and `README.md` file (in either the `.runpod` or root directory).
+3. Create a new GitHub release (the Hub indexes releases, not commits).
+4. (Optional) Add a RunPod Hub badge to your GitHub `README.md` file, so that users can instantly deploy your repo from GitHub.
+
+After all the necessary files are in place and a release has been created, your repo will be marked "Pending" while it's built and tested. After testing is complete, the RunPod team will manually review the repo for publication.
+
+## Update a repo
+
+To update your repo on the Hub, just **create a new GitHub release**. The Hub listing will be re-indexed and rebuilt automatically (usually within an hour).
+
+## Required files
+
+Aside from a working [Serverless implementation](/serverless/overview), every Hub repo requires two additional configuration files:
+
+1. `hub.json` - Defines metadata and deployment settings for your repo.
+2. `tests.json` - Specifies how to test your repo.
+
+These files should be placed in the `.runpod` directory at the root of your repository. This directory takes precedence over the root directory, allowing you to override common files like `Dockerfile` and `README.md` specifically for the Hub.
+
+## hub.json reference
+
+The `hub.json` file defines how your listing appears and functions in the Hub.
+
+You can build your `hub.json` from scratch, or use [this template](#hubjson-template) as a starting point.
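+Putting it all together, a minimal Hub-ready repository might look like this (a sketch; remember that `handler.py`, `Dockerfile`, and `README.md` may sit in either the root or the `.runpod` directory):
+
+```
+your-repo/
+├── .runpod/
+│   ├── hub.json       # Hub metadata and deployment settings
+│   └── tests.json     # Test definitions run during the Hub build step
+├── Dockerfile         # Builds your worker image (a .runpod/Dockerfile takes precedence)
+├── README.md          # Repo documentation (a .runpod/README.md takes precedence)
+└── handler.py         # Your Serverless handler function
+```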
+
+### General metadata
+
+| Field | Description | Required | Values |
+| --- | --- | --- | --- |
+| `title` | Name of your tool | Yes | String |
+| `description` | Brief explanation of functionality | Yes | String |
+| `type` | Deployment type | Yes | `"serverless"` |
+| `category` | Tool category | Yes | `"audio"`, `"embedding"`, `"language"`, `"video"`, or `"image"` |
+| `iconUrl` | URL to tool icon | No | Valid URL |
+| `config` | RunPod configuration | Yes | Object ([see below](#runpod-configuration)) |
+
+### RunPod configuration
+
+| Field | Description | Required | Values |
+| --- | --- | --- | --- |
+| `runsOn` | Machine type | Yes | `"GPU"` or `"CPU"` |
+| `containerDiskInGb` | Container disk space allocation | Yes | Integer (GB) |
+| `cpuFlavor` | CPU configuration | Only if `runsOn` is `"CPU"` | Valid CPU flavor string. For a complete list of available CPU flavors, see [CPU Types](https://docs.runpod.io/references/cpu-types). |
+| `gpuCount` | Number of GPUs | Only if `runsOn` is `"GPU"` | Integer |
+| `gpuIds` | GPU pool specification | Only if `runsOn` is `"GPU"` | Comma-separated pool IDs (e.g., `"ADA_24"`) or GPU IDs (e.g., `"RTX A4000"`), with optional GPU ID negations (e.g., `"-NVIDIA RTX 4090"`). For a complete list of available GPUs, see [GPU Types](https://docs.runpod.io/references/gpu-types). |
+| `allowedCudaVersions` | Supported CUDA versions | No | Array of version strings |
+| `env` | Environment variable definitions | No | Array of objects ([see below](#environment-variables)) |
+| `presets` | Default environment variable values | No | Array of objects ([see below](#presets)) |
+
+### Environment variables
+
+Environment variables can be defined in several ways:
+
+1. **Static variables**: Direct value assignment. For example:
+
+   ```json
+   {
+     "key": "API_KEY",
+     "value": "default-api-key-value"
+   }
+   ```
+
+2. **String inputs**: User-entered text fields. For example:
+
+   ```json
+   {
+     "key": "MODEL_PATH",
+     "input": {
+       "name": "Model path",
+       "type": "string",
+       "description": "Path to the model weights on disk",
+       "default": "/models/stable-diffusion-v1-5",
+       "advanced": false
+     }
+   }
+   ```
+
+3. **Hugging Face inputs**: Fields for selecting a model from the Hugging Face Hub. For example:
+
+   ```json
+   {
+     "key": "HF_MODEL",
+     "input": {
+       "type": "huggingface",
+       "name": "Hugging Face Model",
+       "description": "Model organization/name as listed on the Hugging Face Hub",
+       "default": "runwayml/stable-diffusion-v1-5"
+     }
+   }
+   ```
+
+4. **Option inputs**: User-selected option fields. For example:
+
+   ```json
+   {
+     "key": "PRECISION",
+     "input": {
+       "name": "Model precision",
+       "type": "string",
+       "description": "The numerical precision for model inference",
+       "options": [
+         {"label": "Full Precision (FP32)", "value": "fp32"},
+         {"label": "Half Precision (FP16)", "value": "fp16"},
+         {"label": "8-bit Quantization", "value": "int8"}
+       ],
+       "default": "fp16"
+     }
+   }
+   ```
+
+5. **Number inputs**: User-entered numeric fields. For example:
+
+   ```json
+   {
+     "key": "MAX_TOKENS",
+     "input": {
+       "name": "Maximum tokens",
+       "type": "number",
+       "description": "Maximum number of tokens to generate",
+       "min": 32,
+       "max": 4096,
+       "default": 1024
+     }
+   }
+   ```
+
+6. **Boolean inputs**: User-toggled boolean fields. For example:
+
+   ```json
+   {
+     "key": "USE_FLASH_ATTENTION",
+     "input": {
+       "type": "boolean",
+       "name": "Flash attention",
+       "description": "Enable Flash Attention for faster inference on supported GPUs",
+       "default": true,
+       "trueValue": "true",
+       "falseValue": "false"
+     }
+   }
+   ```
+
+Set `"advanced": true` on an input to mark it as an advanced option. Advanced options are hidden by default in the deployment form.
+
+### Presets
+
+Presets allow you to define groups of default environment variable values. When a user deploys your repo, they'll be offered a dropdown menu with any preset options you've defined.
+
+Here are some example presets:
+
+```json
+"presets": [
+  {
+    "name": "Quality Optimized",
+    "defaults": {
+      "MODEL_NAME": "runpod-stable-diffusion-xl",
+      "INFERENCE_MODE": "quality",
+      "BATCH_SIZE": 1,
+      "ENABLE_CACHING": false,
+      "USE_FLASH_ATTENTION": true
+    }
+  },
+  {
+    "name": "Performance Optimized",
+    "defaults": {
+      "MODEL_NAME": "runpod-stable-diffusion-v1-5",
+      "INFERENCE_MODE": "fast",
+      "BATCH_SIZE": 8,
+      "ENABLE_CACHING": true,
+      "USE_FLASH_ATTENTION": true
+    }
+  }
+]
+```
+
+## hub.json template
+
+Here's an example `hub.json` file that you can use as a starting point:
+
+```json title="hub.json"
+{
+  "title": "Your Tool's Name",
+  "description": "A brief explanation of what your tool does",
+  "type": "serverless",
+  "category": "language",
+  "iconUrl": "https://your-icon-url.com/icon.png",
+
+  "config": {
+    "runsOn": "GPU",
+    "containerDiskInGb": 20,
+
+    "gpuCount": 1,
+    "gpuIds": "RTX A4000,-NVIDIA RTX 4090",
+    "allowedCudaVersions": [
+      "12.8", "12.7", "12.6", "12.5", "12.4",
+      "12.3", "12.2", "12.1", "12.0"
+    ],
+
+    "presets": [
+      {
+        "name": "Preset Name",
+        "defaults": {
+          "STRING_ENV_VAR": "value1",
+          "INT_ENV_VAR": 10,
+          "BOOL_ENV_VAR": true
+        }
+      }
+    ],
+
+    "env": [
+      {
+        "key": "STATIC_ENV_VAR",
+        "value": "static_value"
+      },
+      {
+        "key": "STRING_ENV_VAR",
+        "input": {
+          "name": "User-friendly Name",
+          "type": "string",
+          "description": "Description of this input",
+          "default": "default value",
+          "advanced": false
+        }
+      },
+      {
+        "key": "OPTION_ENV_VAR",
+        "input": {
+          "name": "Select Option",
+          "type": "string",
+          "description": "Choose from available options",
+          "options": [
+            {"label": "Option 1", "value": "value1"},
+            {"label": "Option 2", "value": "value2"}
+          ],
+          "default": "value1"
+        }
+      },
+      {
+        "key": "INT_ENV_VAR",
+        "input": {
+          "name": "Numeric Value",
+          "type": "number",
+          "description": "Enter a number",
+          "min": 1,
+          "max": 100,
+          "default": 50
+        }
+      },
+      {
+        "key": "BOOL_ENV_VAR",
+        "input": {
+          "type": "boolean",
+          "name": "Enable Feature",
+          "description": "Toggle this feature on/off",
+          "default": false,
+          "trueValue": "enabled",
+          "falseValue": "disabled"
+        }
+      }
+    ]
+  }
+}
+```
+
+## tests.json reference
+
+The `tests.json` file defines test cases used to validate your tool's functionality. Tests are executed during the build step after [a release has been created](#how-to-publish-your-repo). The Hub considers a test successful if the endpoint returns a [200 response](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/200).
+
+You can build your `tests.json` from scratch, or use [this template](#testsjson-template) as a starting point.
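+
+During the build step, each test case's `input` object is passed to your handler as the standard `event["input"]` payload, exactly like a regular endpoint request. As an illustrative sketch (the parameter names match the template at the end of this page; the handler logic is hypothetical), a worker that passes such a test could be as simple as:
+
+```python
+import runpod
+
+def handler(event):
+    # The test runner delivers the tests.json "input" object here,
+    # e.g. {"param1": "value1", "param2": "value2"}.
+    params = event["input"]
+
+    # Returning normally yields a 200 response, so the test passes; an
+    # unhandled exception (or exceeding the test's timeout) fails the build.
+    return {"received": params}
+
+runpod.serverless.start({"handler": handler})
+```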
+
+### Test cases
+
+Each test case should include:
+
+| Field | Description | Required | Values |
+| --- | --- | --- | --- |
+| `name` | Test identifier | Yes | String |
+| `input` | Raw job input payload | Yes | Object |
+| `timeout` | Maximum execution time | No | Integer (milliseconds) |
+
+### Test environment configuration
+
+| Field | Description | Required | Values |
+| --- | --- | --- | --- |
+| `gpuTypeId` | GPU type for testing | Only for GPU tests | Valid GPU ID |
+| `gpuCount` | Number of GPUs | Only for GPU tests | Integer |
+| `cpuFlavor` | CPU configuration for testing | Only for CPU tests | Valid CPU flavor string |
+| `env` | Test environment variables | No | Array of key-value pairs |
+| `allowedCudaVersions` | Supported CUDA versions | No | Array of version strings |
+
+## tests.json template
+
+Here's an example `tests.json` file that you can use as a starting point:
+
+```json title="tests.json"
+{
+  "tests": [
+    {
+      "name": "test_case_name",
+      "input": {
+        "param1": "value1",
+        "param2": "value2"
+      },
+      "timeout": 10000
+    }
+  ],
+  "config": {
+    "gpuTypeId": "NVIDIA GeForce RTX 4090",
+    "gpuCount": 1,
+    "env": [
+      {
+        "key": "TEST_ENV_VAR",
+        "value": "test_value"
+      }
+    ],
+    "allowedCudaVersions": [
+      "12.7", "12.6", "12.5", "12.4",
+      "12.3", "12.2", "12.1", "12.0", "11.7"
+    ]
+  }
+}
+```
\ No newline at end of file
diff --git a/docs/overview.md b/docs/overview.md
index 7b772a23..e4d14f19 100644
--- a/docs/overview.md
+++ b/docs/overview.md
@@ -29,9 +29,9 @@ Use Serverless to:
 
 Get started with Serverless:
 
+- [Deploy a preconfigured endpoint from the RunPod Hub.](/hub/overview)
 - [Build a custom Serverless worker.](/serverless/get-started)
 - [Run any LLM as an endpoint using vLLM workers.](/serverless/vllm/get-started)
-- [Tutorial: Deploy a Serverless worker with Stable Diffusion.](/tutorials/serverless/run-your-first)
 
 ## Pods
diff --git a/docs/pods/choose-a-pod.md b/docs/pods/choose-a-pod.md
index c08f170a..2e46f654 100644
--- a/docs/pods/choose-a-pod.md
+++ b/docs/pods/choose-a-pod.md
@@ -1,71 +1,73 @@
 ---
 title: Choose a Pod
-description: "Choose the right Pod instance for your RunPod deployment by considering VRAM, RAM, vCPU, and storage, both Temporary and Persistent, to ensure optimal performance and efficiency."
+description: "Select the optimal Pod configuration for your needs by evaluating GPU requirements, memory needs, and storage specifications to ensure peak performance for your workloads."
 sidebar_position: 3
 ---
 
-Selecting the appropriate Pod instance is a critical step in planning your RunPod deployment. The choice of VRAM, RAM, vCPU, and storage, both Temporary and Persistent, can significantly impact the performance and efficiency of your project.
+# Choosing the right Pod
 
-This page gives guidance on how to choose your Pod configuration. However, these are general guidelines. Keep your specific requirements in mind and plan accordingly.
+Selecting the appropriate Pod configuration is a crucial step in maximizing performance and efficiency for your specific workloads. This guide will help you understand the key factors to consider when choosing a Pod that meets your requirements.
 
-### Overview
+## Understanding your workload needs
 
-It's essential to understand the specific needs of your model. You can normally find detailed information in the model card's description on platforms like Hugging Face or in the `config.json` file of your model.
+Before selecting a Pod, take time to analyze your specific project requirements. Different applications have varying demands for computing resources:
 
-There are tools that can help you assess and calculate your model's specific requirements, such as:
+- Machine learning models require sufficient VRAM and powerful GPUs.
+- Data processing tasks benefit from higher CPU core counts and RAM.
+- Rendering workloads need both strong GPU capabilities and adequate storage.
 
-- [Hugging Face's Model Memory Usage Calculator](https://huggingface.co/spaces/hf-accelerate/model-memory-usage)
-- [Vokturz' Can it run LLM calculator](https://huggingface.co/spaces/Vokturz/can-it-run-llm)
-- [Alexander Smirnov's VRAM Estimator](https://vram.asmirnov.xyz)
+For machine learning models specifically, check the model's documentation on platforms like Hugging Face or review the `config.json` file to understand its resource requirements.
 
-Using these resources should give you a clearer picture of what to look for in a Pod.
+## Resource assessment tools
 
-When transitioning to the selection of your Pod, you should focus on the following main factors:
+There are several online tools that can help you estimate your resource requirements:
 
-- **GPU**
-- **VRAM**
-- **Disk Size**
+- [Hugging Face's Model Memory Usage Calculator](https://huggingface.co/spaces/hf-accelerate/model-memory-usage) provides memory estimates for transformer models.
+- [Vokturz' Can it run LLM calculator](https://huggingface.co/spaces/Vokturz/can-it-run-llm) helps determine if your hardware can run specific language models.
+- [Alexander Smirnov's VRAM Estimator](https://vram.asmirnov.xyz) offers GPU memory requirement approximations.
 
-Each of these components plays a crucial role in the performance and efficiency of your deployment. By carefully considering these elements along with the specific requirements of your project as shown in your initial research, you will be well-equipped to determine the most suitable Pod instance for your needs.
+## Key factors to consider
 
-### GPU
+### GPU selection
 
-The type and power of the GPU directly affect your project's processing capabilities, especially for tasks involving graphics processing and machine learning.
+The GPU is the cornerstone of computational performance for many workloads. When selecting a GPU, consider the architecture that best suits your software: NVIDIA GPUs with CUDA support are essential for most machine learning frameworks, while some applications perform better on specific GPU generations. Evaluate both raw computing power (CUDA cores, tensor cores) and memory bandwidth to ensure optimal performance for your tasks.
 
-### Importance
+For machine learning inference, a mid-range GPU might be sufficient, while training large models requires more powerful options. Check framework-specific recommendations, as PyTorch, TensorFlow, and other frameworks may perform differently across GPU types.
 
-The GPU in your Pod plays a vital role in processing complex algorithms, particularly in areas like data science, video processing, and machine learning. A more powerful GPU can significantly speed up computations and enable more complex tasks.
+### VRAM requirements
 
-### Selection criteria
+VRAM (video RAM) is the dedicated memory on your GPU that stores data being processed. Insufficient VRAM can severely limit your ability to work with large models or datasets.
 
-- **Task Requirements**: Assess the intensity and nature of the GPU tasks in your project.
-- **Compatibility**: Ensure the GPU is compatible with your software and frameworks.
-- **Energy Efficiency**: Consider the power consumption of the GPU, especially for long-term deployments.
+For machine learning models, VRAM requirements increase with model size, batch size, and input dimensions. Large language models often need substantial VRAM: models with billions of parameters may require 24 GB or more. When working with computer vision tasks, higher resolution images or videos consume more VRAM, especially during training phases.
 
-### VRAM
+Remember that VRAM requirements often exceed what's theoretically needed due to memory fragmentation and framework overhead. It's advisable to have at least 20% more VRAM than your baseline calculations suggest. For example, a 7-billion-parameter model in half precision (2 bytes per parameter) needs roughly 14 GB for the weights alone, so plan on at least 17 GB before accounting for batch size and context length.
 
-VRAM (Video RAM) is crucial for tasks that require heavy graphical processing and rendering. It is the dedicated memory used by your GPU to store image data that is displayed on your screen.
+### Storage configuration
 
-### Importance
+Your storage configuration affects both data access speeds and your ability to maintain persistent workspaces. RunPod offers both temporary and persistent [storage options](/pods/storage/types):
 
-VRAM is essential for intensive tasks. It serves as the memory for the GPU, allowing it to store and access data quickly. More VRAM can handle larger textures and more complex graphics, which is crucial for high-resolution displays and advanced 3D rendering.
+- The container volume provides temporary storage that's cleared when your Pod stops. This is ideal for caching, temporary files, or data that doesn't need to persist between sessions. For best performance, keep frequently accessed files on the container disk.
+- The disk volume maintains your data even when Pods are stopped or restarted. This is essential for long-term projects, datasets, trained models, and workspace configurations.
+- For large datasets, consider attaching [network volumes](/pods/storage/create-network-volumes) that can be shared across multiple Pods.
 
-### Selection criteria
+When determining storage needs, account for raw data size, intermediate files generated during processing, and space for output results. For data-intensive workloads, prioritize both capacity and speed to avoid bottlenecks.
 
-- **Graphics Intensity**: More VRAM is needed for graphically intensive tasks such as 3D rendering, gaming, or AI model training that involves large datasets.
-- **Parallel Processing Needs**: Tasks that require simultaneous processing of multiple data streams benefit from more VRAM.
-- **Future-Proofing**: Opting for more VRAM can make your setup more adaptable to future project requirements.
+## Balancing performance and cost
 
-### Storage
+When selecting a Pod, it's important to balance performance needs against budget constraints. Consider the following approaches:
 
-Adequate storage, both temporary and persistent, ensures smooth operation and data management.
+1. Use right-sized resources for your workload. For development and testing, a smaller Pod configuration may be sufficient, while production workloads might require more powerful options.
 
-### Importance
+2. Take advantage of spot instances for non-critical or fault-tolerant workloads to reduce costs. For consistent availability needs, on-demand or reserved Pods provide greater reliability.
 
-Disk size, including both temporary and persistent storage, is critical for data storage, caching, and ensuring that your project has the necessary space for its operations.
+3. For extended usage, explore RunPod's [savings plans](/pods/savings-plans) to optimize your spending while ensuring access to the resources you need.
 
-### Selection criteria
+## Next steps
 
-- **Data Volume**: Estimate the amount of data your project will generate and process.
-- **Speed Requirements**: Faster disk speeds can improve overall system performance.
-- **Data Retention Needs**: Determine the balance between temporary (volatile) and persistent (non-volatile) storage based on your data retention policies.
+Once you've determined your Pod specifications:
+
+1. [Learn how to deploy a Pod](/get-started).
+2. [Manage your Pods](/pods/manage-pods).
+3. [Connect to your Pod](/pods/connect-to-a-pod).
+
+Remember that you can always deploy a new Pod if your requirements evolve. Start with a configuration that meets your immediate needs, then scale up or down based on actual usage patterns and performance metrics.
diff --git a/docs/pods/overview.md b/docs/pods/overview.md
index 0a017d2d..2ceef50b 100644
--- a/docs/pods/overview.md
+++ b/docs/pods/overview.md
@@ -1,51 +1,73 @@
 ---
 title: Overview
-description: "Run containers as Pods with a container registry, featuring compatible architectures, Ubuntu Linux, and persistent storage, with customizable options for GPU type, system disk size, and more."
+description: "Run container-based workloads as Pods with GPU acceleration, persistent storage, and customizable configurations to meet your computing needs."
 sidebar_position: 1
 ---
 
-Pods are running container instances.
-You can pull an instance from a container registry such as Docker Hub, GitHub Container Registry, Amazon Elastic Container Registry, or another compatible registry.
+# Pods overview
+
+Pods are containerized computing environments that provide on-demand access to GPU and CPU resources. Each Pod runs as an isolated container instance that you can customize for your specific workloads.
+
+## What are Pods?
+
+Pods are virtual computing environments built on container technology. Each Pod comes with persistent storage options and customizable configurations, all accessible through web-based interfaces. When you deploy a Pod, you're essentially spinning up a container with your preferred software stack, connected to the hardware resources your workload requires.
+
+## Key components
+
+Each Pod consists of these core components:
+
+- **Container environment**: An Ubuntu Linux-based container that can run almost any compatible software.
+- **Container volume**: Houses the operating system and temporary storage (volatile and cleared when the Pod stops).
+- **Disk volume**: Permanent storage preserved for the duration of your Pod's lease.
+- **Hardware resources**: Allocated vCPU, system RAM, and optional GPUs based on your selection.
+- **Network connectivity**: A proxy connection enabling web access to any exposed port on your container.
+- **Unique identifier**: Each Pod receives a dynamic ID (e.g., `2s56cp0pof1rmt`) for management and access.
+
+## Storage options
+
+Pods offer three [storage types](/storage/types) to match different use cases:
+
+1. **Container volume**: Temporary storage within the container itself. This storage is cleared when the Pod stops.
+2. **Disk volume**: Storage that persists between Pod restarts, similar to an attached hard drive.
+3. **Network volume**: Portable storage that can be moved between machines and persists even after Pod deletion.
+
+## Deployment options
+
+You can deploy Pods in several ways:
+
+- **[From a template](/pods/templates/overview)**: Pre-configured environments for quick setup of common workflows.
+- **Custom containers**: Pull from any compatible container registry such as Docker Hub, GitHub Container Registry, or Amazon ECR.
+- **Custom images**: Build and deploy your own container images.
 
 :::note
 
-When building an image for RunPod on a Mac (Apple Silicon), use the flag `--platform linux/amd64` to ensure your image is compatible with the platform. This flag is necessary because RunPod currently only supports the `linux/amd64` architecture.
+When building images for RunPod on Apple Silicon, use the flag `--platform linux/amd64` to ensure compatibility. RunPod currently only supports the `linux/amd64` architecture.
 
 :::
 
-### Understanding Pod components and configuration
+## Accessing your Pod
+
+Once deployed, you can access your Pod through:
 
-A Pod is a server container created by you to access the hardware, with a dynamically generated assigned identifier.
-For example, `2s56cp0pof1rmt` identifies the instance.
+- **SSH**: Direct command-line access for development and management.
+- **Web proxy**: HTTP access to exposed web services via URLs in the format `https://[pod-id]-[port].proxy.runpod.net`.
+- **API**: Programmatic access and control through the RunPod API.
 
-A Pod comprises a container volume with the operating system and temporary storage, a disk volume for permanent storage, an Ubuntu Linux container, allocated vCPU and system RAM, optional GPUs or CPUs for specific workloads, a pre-configured template for easy software access, and a proxy connection for web access.
+## Customization options
 
-Each Pod encompasses a variety of components:
+Pods offer extensive customization to match your specific requirements.
 
-- A container volume that houses the operating system and temporary storage.
-  - This storage is volatile and will be lost if the Pod is halted or rebooted.
-- A disk volume for permanent storage, preserved for the duration of the Pod's lease, akin to a hard disk.
-  - This storage is persistent and will be available even if the Pod is halted or rebooted.
-- Network storage, similar to a volume but can be moved between machines.
-  - When using network storage, you can only delete the Pod.
-- An Ubuntu Linux container, capable of running almost any software that can be executed on Ubuntu.
-- Assigned vCPU and system RAM dedicated to the container and any processes it runs.
-- Optional GPUs or CPUs, tailored for specific workloads like CUDA or AI/ML tasks, though not mandatory for starting the container.
-- A pre-configured template that automates the installation of software and settings upon Pod creation, offering straightforward, one-click access to various packages.
-- A proxy connection for web access, allowing connectivity to any open port on the container.
-  - For example, `https://[pod-id]-[port number].proxy.runpod.net`, or `https://2s56cp0pof1rmt-7860.proxy.runpod.net/`).
+You can select your preferred [GPU type](/references/gpu-types) and quantity, adjust system disk size, and specify your container image.
 
-To get started, see how to [Choose a Pod](/pods/choose-a-pod) then see the instructions on [Manage Pods](/pods/manage-pods).
+Additionally, you can configure custom start commands, set [environment variables](/pods/references/environment-variables), define [exposed HTTP/TCP ports](/pods/configuration/expose-ports), and implement various [storage configurations](/storage/types) to optimize your Pod for your specific workload.
 
-## Learn more
+## Getting started
 
-You can jump straight to a running Pod by starting from a [template](/pods/templates/overview). For more customization, you can configure the following:
+To start using Pods:
 
-- [GPU Type](/references/gpu-types) and quantity
-- System Disk Size
-- Start Command
-- [Environment Variables](/pods/references/environment-variables)
-- [Expose HTTP/TCP ports](/pods/configuration/expose-ports)
-- [Persistent Storage Options](/category/storage)
+1. Follow the tutorial to [deploy your first Pod](/get-started).
+2. [Choose a Pod](/pods/choose-a-pod) based on your resource needs.
+3. [Learn how to start, stop, and terminate your Pod](/pods/manage-pods).
+4. [Connect to your Pod](/pods/connect-to-a-pod) using SSH, JupyterLab, or the web terminal.
 
-To get started, see how to [Choose a Pod](/pods/choose-a-pod) then see the instructions on [Manage Pods](/pods/manage-pods).
+For quicker deployment, start with a [template](/pods/templates/overview) that includes pre-configured environments for common workflows.
diff --git a/docs/serverless/overview.md b/docs/serverless/overview.md
index a9d41f86..7a1c25a6 100644
--- a/docs/serverless/overview.md
+++ b/docs/serverless/overview.md
@@ -50,23 +50,21 @@ runpod.serverless.start({"handler": handler}) # Required
 
 ## Deployment options
 
-RunPod Serverless offers three ways to deploy your workloads, each designed for different use cases:
+RunPod Serverless offers several ways to deploy your workloads, each designed for different use cases:
 
-### Quick deploy an endpoint
+### RunPod Hub
 
-**Best for**: Deploying preconfigured AI models with minimal effort (no coding required).
+**Best for**: Instantly deploying preconfigured AI models.
 
-You can deploy a Serverless endpoint in minutes from the RunPod console:
+You can deploy a Serverless endpoint from a repo in the [RunPod Hub](/hub/overview) in seconds:
 
-1. Go to the [Serverless page](https://www.runpod.io/console/serverless) in the RunPod console.
-2. Under **Quick Deploy**, browse the collection of preconfigured workers and select one that matches your needs.
-3. Click **Configure**. Depending on your choice, you may need to enter a [Hugging Face model](https://huggingface.co/models).
-4. Choose a **Worker Configuration**. Quick deploys are preconfigured with
-5. Click the **Create Endpoint**.
+1. Navigate to the [Hub page](https://www.runpod.io/console/hub) in the RunPod console.
+2. Browse the collection and select a repo that matches your needs.
+3. Review the repo details, including hardware requirements and available configuration options, to ensure compatibility with your use case.
+4. Click the **Deploy** button in the top-right of the repo page. You can also use the dropdown menu to deploy an older version.
+5. Click **Create Endpoint**.
 
-You'll be redirected to your new endpoint. Now you're ready to [send a request](/serverless/endpoints/send-requests).
-
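+Once the endpoint is created, you can call it over HTTP. The sketch below uses Python's `requests` library; the endpoint ID is a placeholder, and the `input` payload shape depends on the worker you deployed:
+
+```python
+import os
+import requests
+
+ENDPOINT_ID = "your-endpoint-id"  # shown on the endpoint details page
+API_KEY = os.environ["RUNPOD_API_KEY"]  # your RunPod API key
+
+# /runsync waits for the result; use /run to queue a job asynchronously.
+response = requests.post(
+    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
+    headers={"Authorization": f"Bearer {API_KEY}"},
+    json={"input": {"prompt": "Hello, world!"}},  # payload is worker-specific
+)
+
+print(response.json())
+```
+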
-[Quick deploy an endpoint →](https://www.runpod.io/console/serverless)
+[Deploy a repo from the RunPod Hub →](https://www.runpod.io/console/hub)
 
 ### Deploy a vLLM worker
diff --git a/sidebars.js b/sidebars.js
index 8ca7d8c2..5b936a95 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -32,6 +32,17 @@ module.exports = {
         },
       ],
     },
+    {
+      type: "category",
+      label: "Hub [beta]",
+      collapsible: false,
+      items: [
+        {
+          type: "autogenerated",
+          dirName: "hub",
+        },
+      ],
+    },
     {
       type: "category",
       label: "Pods",
@@ -43,6 +54,7 @@ module.exports = {
       ],
     },
+
     {
       type: "category",
       label: "Instant Clusters",
diff --git a/static/img/docs/hub-homepage.png b/static/img/docs/hub-homepage.png
new file mode 100644
index 00000000..2ef34d72
Binary files /dev/null and b/static/img/docs/hub-homepage.png differ
diff --git a/static/img/docs/hub-publish-page.png b/static/img/docs/hub-publish-page.png
new file mode 100644
index 00000000..a71f4ed9
Binary files /dev/null and b/static/img/docs/hub-publish-page.png differ