4 changes: 2 additions & 2 deletions community-solutions/ssh-password-migration/overview.mdx
@@ -6,7 +6,7 @@ icon: "arrows-rotate"

## What are these tools?

These are simple bash and Python scripts that solve a critical problem: **migrating data between Runpod instances when you need to move Pods** (e.g., when your Pod gets stuck with zero GPUs or you need to switch to a different instance).
These are simple bash and Python scripts that solve a critical problem: **migrating data between Runpod instances when you need to move Pods** (e.g., when your original GPU becomes [unavailable on your Pod's machine](/pods/manage-pods#pod-migration) or you need to switch to a different instance).

<Note>
Check the repository for additional features, updates, and documentation: [github.com/justinwlin/Runpod-SSH-Password](https://github.com/justinwlin/Runpod-SSH-Password)
@@ -15,7 +15,7 @@ Check the repository for additional features, updates, and documentation: [githu
## The problem it solves

When Runpod users encounter issues like:
- Pod stuck with **zero GPUs allocated**.
- Pod's **original GPU becomes unavailable** on its physical machine.
- Need to **migrate to a different GPU type**.
- Want to **transfer data between Pods**.
- Need to **backup data before terminating a Pod**.
28 changes: 27 additions & 1 deletion pods/manage-pods.mdx
@@ -155,7 +155,7 @@ If your Pod has a [network volume](/pods/storage/create-network-volumes) attache

When a Pod is stopped, data in the container volume is cleared, but data in the `/workspace` directory is preserved. To learn more about how Pod storage works, see [Storage overview](/pods/storage/types).

By stopping a Pod you are effectively releasing the GPU on the machine, and you may be reallocated 0 GPUs when you start the Pod again. For more info, see the [FAQ](/references/faq#why-do-i-have-zero-gpus-assigned-to-my-pod%3F).
By stopping a Pod you are effectively releasing the GPU on the machine, and your original GPU may become unavailable when you restart the Pod. Runpod provides automatic migration options to help you get back to work quickly. For more info, see [Pod migration](#pod-migration).

<Warning>
After a Pod is stopped, you will still be charged for its [disk volume](/pods/storage/types#disk-volume) storage. If you don't need to retain your Pod environment, you should terminate it completely.
@@ -323,3 +323,29 @@ Pods provide two types of logs to help you monitor and troubleshoot your workloa
- **System logs** provide detailed information about your Pod's lifecycle, such as container creation, image download, extraction, startup, and shutdown events.

To view your logs, open the [Pods page](https://www.console.runpod.io/pods), expand your Pod, and click the **Logs** button. This gives you real-time access to both container and system logs, making it easy to diagnose issues or monitor your Pod's activity.

## Pod migration

When you deploy a Pod, your Pod is locked to a single physical machine in a datacenter. As long as you keep your Pod running you'll maintain access to it, and your instance charges will stay the same.

However, if you stop your Pod, it immediately becomes available for other users to rent. If you try to start your Pod again, but your machine is now full (i.e. someone rented all 4-8 GPUs), you'll be offered the option to migrate your Pod data to a new machine.

When this happens, you have three options:

1. **Automatically migrate Pod data**: This one-click option finds a new machine with the requested GPU type, spins up a new Pod with the same specs as the current one, and migrates your data from the old Pod automatically so you can get back to work quickly.

2. **Start Pod with CPUs**: If you don't require GPUs immediately, you can instead start your Pod with CPUs only, so you can still access your data or migrate it manually.

3. **Do nothing**: If you don't want to migrate your data, you can simply wait for your Pod's machine to become available again. There is no guarantee of how long this might take. Try waiting a few minutes before trying again.

<Warning>

If you migrate your Pod data, **your new Pod will have a new IP address**. This may affect your application if:

- You have a Pod ID hardcoded in an API call.
- You have a proxy URL hardcoded (e.g., `b63b243b47bd340becc72fbe9b3e642c.proxy.runpod.net`).
- You have a firewall or VPN set up with a specific Pod ID in it.
- You have a firewall or VPN set up with a specific Pod IP address in it.
- You are using a specific URL for your server (when you start a new Pod, you will get a new URL for the UI or server you've set up).

</Warning>
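
One way to soften the impact of the items above is to keep the Pod's host name in configuration rather than in application code, so that a migration only requires updating a single value. The following is a minimal sketch under stated assumptions: the `POD_PROXY_HOST` variable name and the `pod_base_url` helper are hypothetical names chosen for illustration and are not part of Runpod's tooling, and the host shown is a placeholder.

```python
import os

# Hypothetical variable name chosen for this example; not a Runpod convention.
POD_HOST_ENV_VAR = "POD_PROXY_HOST"


def pod_base_url(env=None, scheme="https"):
    """Build a Pod's base URL from configuration instead of a hardcoded host."""
    # Fall back to the process environment when no mapping is passed in.
    env = os.environ if env is None else env
    # Normalize the value: drop surrounding whitespace and any trailing slash.
    host = (env.get(POD_HOST_ENV_VAR) or "").strip().rstrip("/")
    if not host:
        raise RuntimeError(f"Set {POD_HOST_ENV_VAR} to your Pod's current host name")
    return f"{scheme}://{host}"


if __name__ == "__main__":
    # Placeholder host for illustration only; replace with your Pod's actual host.
    example_env = {POD_HOST_ENV_VAR: "example-host.proxy.runpod.net"}
    print(pod_base_url(example_env))
```

After a migration, updating the configured value (here, the assumed `POD_PROXY_HOST` variable) would be enough for client code that calls `pod_base_url()`; firewall and VPN rules still need to be updated separately.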
10 changes: 5 additions & 5 deletions references/faq.mdx
@@ -138,17 +138,17 @@ We don't currently support Windows. We want to do this in the future, but we do

Runpod needs to provide you with reliable servers. All of our listed servers must meet minimum reliability, and most are running in a data center! However, if you want the highest level of reliability and security, use Secure Cloud. Runpod calculates server reliability by maintaining a heartbeat with each server in real-time.

### Why do I have zero GPUs assigned to my Pod?
### Why am I being asked to migrate my Pod?

Most of our machines have between 4 and 8 GPUs per physical machine. When you start a Pod, it is locked to a specific physical machine. If you keep it running (On-Demand), then that GPU cannot be taken from you. However, if you stop your Pod, it becomes available for a different user to rent. When you want to start your Pod again, your specific machine may be wholly occupied. In this case, we give you the option to spin up your Pod with zero GPUs so you can retain access to your data.
In most cases, our machines have 4-8 GPUs per physical machine. When you start a Pod, this locks your Pod to that specific physical machine. As long as you keep your Pod running you will maintain access to it, and your instance charges will stay the same.

Remember that this does not mean there are no more GPUs of that type available, just none on the physical machine that specific Pod is locked to. Note that transfer Pods have limited computing capabilities, so transferring files using a UI may be difficult, and you may need to resort to terminal access or cloud sync options.
However, **if you stop your Pod, it immediately becomes available for other users to rent**. If you try to start your Pod again, but your machine is now full (i.e. someone rented all 4-8 GPUs), you'll be offered the option to migrate your Pod data to a new machine.

If you want to avoid this, using network volumes is the best choice. [Learn how to use them here](/pods/storage/create-network-volumes).
For more information, see [Pod migration](/pods/manage-pods#pod-migration).

#### What are Network Volumes?

Network volumes allow you to share data between Pods and generally be more mobile with your important data. This feature is only available in specific secure cloud data centers, but we are actively rolling it out to more and more of our secure cloud footprint. If you use network volumes, you should rarely run into situations where you cannot use your data with a GPU without a file transfer.
Network volumes allow you to share data between Pods and generally be more mobile with your important data. This feature is only available in certain secure cloud data centers, but we are actively rolling it out to more and more of our secure cloud footprint. If you use network volumes, you should rarely run into situations where you cannot use your data with a GPU without a file transfer.

[Read about it here](/pods/storage/create-network-volumes).
