42 changes: 19 additions & 23 deletions recipes_source/compiling_optimizer.rst
@@ -1,25 +1,24 @@
(beta) Compiling the optimizer with torch.compile
==========================================================================================

**Author:** `Michael Lazos <https://github.com/mlazos>`_
**Translator:** `김승환 <https://github.com/7SH7>`_

The optimizer is a key algorithm for training any deep learning model.
Since it is responsible for updating every model parameter, it can often
become the bottleneck in training performance for large models. In this recipe,
we will apply ``torch.compile`` to the optimizer to observe the GPU performance
improvement.

.. note::

This tutorial requires PyTorch 2.2.0 or later.

Model Setup
~~~~~~~~~~~~~~~~~~~~~
For this example, we'll use a simple sequence of linear layers.
Since we are only benchmarking the optimizer, the choice of model doesn't matter
because optimizer performance is a function of the number of parameters.

Depending on what machine you are using, your exact results may vary.

.. code-block:: python

@@ -32,19 +31,17 @@ Depending on what machine you are using, your exact results may vary.
output = model(input)
output.sum().backward()
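
The diff collapses the lines that construct the model and its input. A minimal
sketch of what such a setup could look like, assuming a stack of
``torch.nn.Linear`` layers on a CUDA device (the layer sizes and count here are
illustrative assumptions, not necessarily the recipe's original values):

.. code-block:: python

    import torch

    # A simple stack of linear layers; only the total parameter count matters
    # when benchmarking the optimizer.
    model = torch.nn.Sequential(
        *[torch.nn.Linear(1024, 1024, bias=False, device="cuda") for _ in range(10)]
    )
    input = torch.rand(1024, device="cuda")

    # Run one forward and backward pass so every parameter has a gradient
    # for the optimizer to consume.
    output = model(input)
    output.sum().backward()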

Setting up and running the optimizer benchmark
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this example, we'll use the Adam optimizer
and create a helper function to wrap ``step()``
in ``torch.compile()``.

.. note::

``torch.compile`` is only supported on CUDA devices with compute capability >= 7.0.

.. code-block:: python

# exit cleanly if we are on a device that doesn't support torch.compile
if torch.cuda.get_device_capability() < (7, 0):
print("Exiting because torch.compile is not supported on this device.")
import sys
@@ -59,7 +56,7 @@ in ``torch.compile()``.
opt.step()


# Let's define a helpful benchmarking function:
import torch.utils.benchmark as benchmark


@@ -70,7 +67,7 @@ in ``torch.compile()``.
return t0.blocked_autorange().mean * 1e6


# Warmup runs to compile the function
for _ in range(5):
fn()

@@ -82,13 +79,12 @@ in ``torch.compile()``.
print(f"eager runtime: {eager_runtime}us")
print(f"compiled runtime: {compiled_runtime}us")

Sample Results:

* Eager runtime: 747.2437149845064us
* Compiled runtime: 392.07384741178us

See Also
~~~~~~~~~

* For an in-depth technical overview, see `Compiling the optimizer with PT2 <https://dev-discuss.pytorch.org/t/compiling-the-optimizer-with-pt2/1669>`__