
Fix typos in docs and comments #11416


Open · wants to merge 1 commit into base: main
2 changes: 1 addition & 1 deletion docs/source/en/api/pipelines/animatediff.md
@@ -966,7 +966,7 @@ pipe.to("cuda")
prompt = {
0: "A caterpillar on a leaf, high quality, photorealistic",
40: "A caterpillar transforming into a cocoon, on a leaf, near flowers, photorealistic",
80: "A cocoon on a leaf, flowers in the backgrond, photorealistic",
80: "A cocoon on a leaf, flowers in the background, photorealistic",
120: "A cocoon maturing and a butterfly being born, flowers and leaves visible in the background, photorealistic",
160: "A beautiful butterfly, vibrant colors, sitting on a leaf, flowers in the background, photorealistic",
200: "A beautiful butterfly, flying away in a forest, photorealistic",
2 changes: 1 addition & 1 deletion docs/source/en/api/pipelines/ledits_pp.md
@@ -29,7 +29,7 @@ You can find additional information about LEDITS++ on the [project page](https:/
</Tip>

<Tip warning={true}>
Due to some backward compatability issues with the current diffusers implementation of [`~schedulers.DPMSolverMultistepScheduler`] this implementation of LEdits++ can no longer guarantee perfect inversion.
Due to some backward compatibility issues with the current diffusers implementation of [`~schedulers.DPMSolverMultistepScheduler`] this implementation of LEdits++ can no longer guarantee perfect inversion.
This issue is unlikely to have any noticeable effects on applied use-cases. However, we provide an alternative implementation that guarantees perfect inversion in a dedicated [GitHub repo](https://github.com/ml-research/ledits_pp).
</Tip>

4 changes: 2 additions & 2 deletions docs/source/en/api/pipelines/wan.md
@@ -285,7 +285,7 @@ pipe = WanImageToVideoPipeline.from_pretrained(
image_encoder=image_encoder,
torch_dtype=torch.bfloat16
)
# Since we've offloaded the larger models alrady, we can move the rest of the model components to GPU
# Since we've offloaded the larger models already, we can move the rest of the model components to GPU
pipe.to("cuda")

image = load_image(
@@ -368,7 +368,7 @@ pipe = WanImageToVideoPipeline.from_pretrained(
image_encoder=image_encoder,
torch_dtype=torch.bfloat16
)
# Since we've offloaded the larger models alrady, we can move the rest of the model components to GPU
# Since we've offloaded the larger models already, we can move the rest of the model components to GPU
pipe.to("cuda")

image = load_image(
4 changes: 2 additions & 2 deletions docs/source/en/using-diffusers/inference_with_lcm.md
@@ -485,7 +485,7 @@ image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image).resize((1024, 1216))

adapter = T2IAdapter.from_pretrained("TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16, varient="fp16").to("cuda")
adapter = T2IAdapter.from_pretrained("TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16, variant="fp16").to("cuda")

unet = UNet2DConditionModel.from_pretrained(
"latent-consistency/lcm-sdxl",
@@ -551,7 +551,7 @@ image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image).resize((1024, 1024))

adapter = T2IAdapter.from_pretrained("TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16, varient="fp16").to("cuda")
adapter = T2IAdapter.from_pretrained("TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16, variant="fp16").to("cuda")

pipe = StableDiffusionXLAdapterPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
6 changes: 3 additions & 3 deletions docs/source/en/using-diffusers/pag.md
@@ -154,11 +154,11 @@ pipeline = AutoPipelineForInpainting.from_pretrained(
pipeline.enable_model_cpu_offload()
```

You can enable PAG on an exisiting inpainting pipeline like this
You can enable PAG on an existing inpainting pipeline like this

```py
pipeline_inpaint = AutoPipelineForInpaiting.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16)
pipeline = AutoPipelineForInpaiting.from_pipe(pipeline_inpaint, enable_pag=True)
pipeline_inpaint = AutoPipelineForInpainting.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16)
pipeline = AutoPipelineForInpainting.from_pipe(pipeline_inpaint, enable_pag=True)
```

This still works when your pipeline has a different task:
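For illustration, a minimal sketch of that cross-task reuse (assuming the same SDXL checkpoint as above; names and values are illustrative, not part of this doc):

```py
import torch
from diffusers import AutoPipelineForText2Image, AutoPipelineForInpainting

# Reuse the components of a text-to-image pipeline in a PAG-enabled inpainting pipeline.
pipeline_t2i = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipeline = AutoPipelineForInpainting.from_pipe(pipeline_t2i, enable_pag=True)
```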
4 changes: 2 additions & 2 deletions examples/advanced_diffusion_training/README.md
@@ -125,7 +125,7 @@ Now we'll simply specify the name of the dataset and caption column (in this cas
```

You can also load a dataset straight from by specifying it's name in `dataset_name`.
Look [here](https://huggingface.co/blog/sdxl_lora_advanced_script#custom-captioning) for more info on creating/loadin your own caption dataset.
Look [here](https://huggingface.co/blog/sdxl_lora_advanced_script#custom-captioning) for more info on creating/loading your own caption dataset.

- **optimizer**: for this example, we'll use [prodigy](https://huggingface.co/blog/sdxl_lora_advanced_script#adaptive-optimizers) - an adaptive optimizer
- **pivotal tuning**
@@ -404,7 +404,7 @@ The advanced script now supports custom choice of U-net blocks to train during D
> In light of this, we're introducing a new feature to the advanced script to allow for configurable U-net learned blocks.

**Usage**
Configure LoRA learned U-net blocks adding a `lora_unet_blocks` flag, with a comma seperated string specifying the targeted blocks.
Configure LoRA learned U-net blocks adding a `lora_unet_blocks` flag, with a comma separated string specifying the targeted blocks.
e.g:
```bash
--lora_unet_blocks="unet.up_blocks.0.attentions.0,unet.up_blocks.0.attentions.1"
2 changes: 1 addition & 1 deletion examples/advanced_diffusion_training/README_flux.md
@@ -141,7 +141,7 @@ Now we'll simply specify the name of the dataset and caption column (in this cas
```

You can also load a dataset straight from by specifying it's name in `dataset_name`.
Look [here](https://huggingface.co/blog/sdxl_lora_advanced_script#custom-captioning) for more info on creating/loadin your own caption dataset.
Look [here](https://huggingface.co/blog/sdxl_lora_advanced_script#custom-captioning) for more info on creating/loading your own caption dataset.

- **optimizer**: for this example, we'll use [prodigy](https://huggingface.co/blog/sdxl_lora_advanced_script#adaptive-optimizers) - an adaptive optimizer
- **pivotal tuning**
2 changes: 1 addition & 1 deletion examples/amused/README.md
@@ -1,6 +1,6 @@
## Amused training

Amused can be finetuned on simple datasets relatively cheaply and quickly. Using 8bit optimizers, lora, and gradient accumulation, amused can be finetuned with as little as 5.5 GB. Here are a set of examples for finetuning amused on some relatively simple datasets. These training recipies are aggressively oriented towards minimal resources and fast verification -- i.e. the batch sizes are quite low and the learning rates are quite high. For optimal quality, you will probably want to increase the batch sizes and decrease learning rates.
Amused can be finetuned on simple datasets relatively cheaply and quickly. Using 8bit optimizers, lora, and gradient accumulation, amused can be finetuned with as little as 5.5 GB. Here are a set of examples for finetuning amused on some relatively simple datasets. These training recipes are aggressively oriented towards minimal resources and fast verification -- i.e. the batch sizes are quite low and the learning rates are quite high. For optimal quality, you will probably want to increase the batch sizes and decrease learning rates.

All training examples use fp16 mixed precision and gradient checkpointing. We don't show 8 bit adam + lora as its about the same memory use as just using lora (bitsandbytes uses full precision optimizer states for weights below a minimum size).
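For reference, a minimal sketch of the 8-bit-Adam-plus-LoRA combination mentioned above (assumes `bitsandbytes` is installed and `model` is a LoRA-wrapped model; illustrative only):

```python
import bitsandbytes as bnb

# Only the small LoRA matrices are trainable, so an 8-bit optimizer saves little here:
# bitsandbytes falls back to full-precision states for tensors below its minimum size.
# `model` is assumed to be a LoRA-adapted model, not defined in this README.
lora_params = [p for n, p in model.named_parameters() if "lora_" in n and p.requires_grad]
optimizer = bnb.optim.AdamW8bit(lora_params, lr=1e-4, weight_decay=1e-2)
```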

2 changes: 1 addition & 1 deletion examples/cogvideo/README.md
@@ -201,7 +201,7 @@ Note that setting the `<ID_TOKEN>` is not necessary. From some limited experimen
> - The original repository uses a `lora_alpha` of `1`. We found this not suitable in many runs, possibly due to difference in modeling backends and training settings. Our recommendation is to set to the `lora_alpha` to either `rank` or `rank // 2`.
> - If you're training on data whose captions generate bad results with the original model, a `rank` of 64 and above is good and also the recommendation by the team behind CogVideoX. If the generations are already moderately good on your training captions, a `rank` of 16/32 should work. We found that setting the rank too low, say `4`, is not ideal and doesn't produce promising results.
> - The authors of CogVideoX recommend 4000 training steps and 100 training videos overall to achieve the best result. While that might yield the best results, we found from our limited experimentation that 2000 steps and 25 videos could also be sufficient.
> - When using the Prodigy opitimizer for training, one can follow the recommendations from [this](https://huggingface.co/blog/sdxl_lora_advanced_script) blog. Prodigy tends to overfit quickly. From my very limited testing, I found a learning rate of `0.5` to be suitable in addition to `--prodigy_use_bias_correction`, `prodigy_safeguard_warmup` and `--prodigy_decouple`.
> - When using the Prodigy optimizer for training, one can follow the recommendations from [this](https://huggingface.co/blog/sdxl_lora_advanced_script) blog. Prodigy tends to overfit quickly. From my very limited testing, I found a learning rate of `0.5` to be suitable in addition to `--prodigy_use_bias_correction`, `prodigy_safeguard_warmup` and `--prodigy_decouple`.
> - The recommended learning rate by the CogVideoX authors and from our experimentation with Adam/AdamW is between `1e-3` and `1e-4` for a dataset of 25+ videos.
>
> Note that our testing is not exhaustive due to limited time for exploration. Our recommendation would be to play around with the different knobs and dials to find the best settings for your data.
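For reference, a minimal sketch of the Prodigy setup described in the note above (assumes the `prodigyopt` package and an existing `params_to_optimize` list; the values mirror the note's suggestions, not a verified recipe):

```python
from prodigyopt import Prodigy

# Prodigy adapts its own step size, so the "learning rate" acts as a multiplier (0.5 here).
optimizer = Prodigy(
    params_to_optimize,
    lr=0.5,
    use_bias_correction=True,   # --prodigy_use_bias_correction
    safeguard_warmup=True,      # --prodigy_safeguard_warmup
    decouple=True,              # --prodigy_decouple (decoupled weight decay)
    weight_decay=1e-2,
)
```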
2 changes: 1 addition & 1 deletion examples/cogvideo/train_cogvideox_image_to_video_lora.py
@@ -879,7 +879,7 @@ def prepare_rotary_positional_embeddings(


def get_optimizer(args, params_to_optimize, use_deepspeed: bool = False):
# Use DeepSpeed optimzer
# Use DeepSpeed optimizer
if use_deepspeed:
from accelerate.utils import DummyOptim

2 changes: 1 addition & 1 deletion examples/cogvideo/train_cogvideox_lora.py
@@ -901,7 +901,7 @@ def prepare_rotary_positional_embeddings(


def get_optimizer(args, params_to_optimize, use_deepspeed: bool = False):
# Use DeepSpeed optimzer
# Use DeepSpeed optimizer
if use_deepspeed:
from accelerate.utils import DummyOptim

2 changes: 1 addition & 1 deletion examples/community/README.md
@@ -4864,7 +4864,7 @@ python -m pip install intel_extension_for_pytorch
```
python -m pip install intel_extension_for_pytorch==<version_name> -f https://developer.intel.com/ipex-whl-stable-cpu
```
2. After pipeline initialization, `prepare_for_ipex()` should be called to enable IPEX accelaration. Supported inference datatypes are Float32 and BFloat16.
2. After pipeline initialization, `prepare_for_ipex()` should be called to enable IPEX acceleration. Supported inference datatypes are Float32 and BFloat16.

```python
pipe = AnimateDiffPipelineIpex.from_pretrained(base, motion_adapter=adapter, torch_dtype=dtype).to(device)
4 changes: 2 additions & 2 deletions examples/community/dps_pipeline.py
@@ -336,13 +336,13 @@ def contributions(self, in_length, out_length, scale, kernel, kernel_width, anti
expanded_kernel_width = np.ceil(kernel_width) + 2

# Determine a set of field_of_view for each each output position, these are the pixels in the input image
# that the pixel in the output image 'sees'. We get a matrix whos horizontal dim is the output pixels (big) and the
# that the pixel in the output image 'sees'. We get a matrix whose horizontal dim is the output pixels (big) and the
# vertical dim is the pixels it 'sees' (kernel_size + 2)
field_of_view = np.squeeze(
np.int16(np.expand_dims(left_boundary, axis=1) + np.arange(expanded_kernel_width) - 1)
)

# Assign weight to each pixel in the field of view. A matrix whos horizontal dim is the output pixels and the
# Assign weight to each pixel in the field of view. A matrix whose horizontal dim is the output pixels and the
# vertical dim is a list of weights matching to the pixel in the field of view (that are specified in
# 'field_of_view')
weights = fixed_kernel(1.0 * np.expand_dims(match_coordinates, axis=1) - field_of_view - 1)
12 changes: 6 additions & 6 deletions examples/community/hd_painter.py
@@ -201,16 +201,16 @@ def __call__(
# ================================================== #
# We use a hack by running the code from the BasicTransformerBlock that is between Self and Cross attentions here
# The other option would've been modifying the BasicTransformerBlock and adding this functionality here.
# I assumed that changing the BasicTransformerBlock would have been a bigger deal and decided to use this hack isntead.
# I assumed that changing the BasicTransformerBlock would have been a bigger deal and decided to use this hack instead.

# The SelfAttention block recieves the normalized latents from the BasicTransformerBlock,
# The SelfAttention block receives the normalized latents from the BasicTransformerBlock,
# But the residual of the output is the non-normalized version.
# Therefore we unnormalize the input hidden state here
unnormalized_input_hidden_states = (
input_hidden_states + self.transformer_block.norm1.bias
) * self.transformer_block.norm1.weight

# TODO: return if neccessary
# TODO: return if necessary
# if self.use_ada_layer_norm_zero:
# attn_output = gate_msa.unsqueeze(1) * attn_output
# elif self.use_ada_layer_norm_single:
@@ -220,7 +220,7 @@ def __call__(
if transformer_hidden_states.ndim == 4:
transformer_hidden_states = transformer_hidden_states.squeeze(1)

# TODO: return if neccessary
# TODO: return if necessary
# 2.5 GLIGEN Control
# if gligen_kwargs is not None:
# transformer_hidden_states = self.fuser(transformer_hidden_states, gligen_kwargs["objs"])
@@ -266,7 +266,7 @@ def __call__(
) = cross_attention_input_hidden_states.chunk(2)

# Same split for the encoder_hidden_states i.e. the tokens
# Since the SelfAttention processors don't get the encoder states as input, we inject them into the processor in the begining.
# Since the SelfAttention processors don't get the encoder states as input, we inject them into the processor in the beginning.
_encoder_hidden_states_unconditional, encoder_hidden_states_conditional = self.encoder_hidden_states.chunk(
2
)
@@ -896,7 +896,7 @@ def __call__(
class GaussianSmoothing(nn.Module):
"""
Apply gaussian smoothing on a
1d, 2d or 3d tensor. Filtering is performed seperately for each channel
1d, 2d or 3d tensor. Filtering is performed separately for each channel
in the input using a depthwise convolution.

Args:
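For intuition, a standalone sketch of depthwise Gaussian smoothing in plain PyTorch (not the pipeline's class; input shape, kernel size, and sigma are illustrative):

```python
import torch
import torch.nn.functional as F

def gaussian_smooth_2d(x: torch.Tensor, kernel_size: int = 3, sigma: float = 0.5) -> torch.Tensor:
    # x is assumed to have shape (batch, channels, height, width).
    # Build a separable 2D Gaussian kernel.
    coords = torch.arange(kernel_size, dtype=x.dtype, device=x.device) - (kernel_size - 1) / 2
    g = torch.exp(-(coords**2) / (2 * sigma**2))
    g = g / g.sum()
    kernel_2d = torch.outer(g, g)
    # Apply it per channel with a depthwise (grouped) convolution, as described above.
    channels = x.shape[1]
    weight = kernel_2d.expand(channels, 1, kernel_size, kernel_size).contiguous()
    return F.conv2d(x, weight, padding=kernel_size // 2, groups=channels)
```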
2 changes: 1 addition & 1 deletion examples/community/img2img_inpainting.py
@@ -161,7 +161,7 @@ def __call__(
`Image`, or tensor representing an image batch which will be inpainted, *i.e.* parts of the image will
be masked out with `mask_image` and repainted according to `prompt`.
inner_image (`torch.Tensor` or `PIL.Image.Image`):
`Image`, or tensor representing an image batch which will be overlayed onto `image`. Non-transparent
`Image`, or tensor representing an image batch which will be overlaid onto `image`. Non-transparent
regions of `inner_image` must fit inside white pixels in `mask_image`. Expects four channels, with
the last channel representing the alpha channel, which will be used to blend `inner_image` with
`image`. If not provided, it will be forcibly cast to RGBA.
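For intuition, a standalone sketch of the alpha-channel blend described for `inner_image` (plain PIL, file names are placeholders; the pipeline performs the equivalent step on tensors):

```python
from PIL import Image

# Overlay an RGBA inner image onto the base image using its alpha channel.
base = Image.open("image.png").convert("RGBA")
inner = Image.open("inner_image.png").convert("RGBA").resize(base.size)
blended = Image.alpha_composite(base, inner).convert("RGB")
blended.save("blended.png")
```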
4 changes: 2 additions & 2 deletions examples/community/latent_consistency_img2img.py
@@ -647,7 +647,7 @@ def _threshold_sample(self, sample: torch.Tensor) -> torch.Tensor:
return sample

def set_timesteps(
self, stength, num_inference_steps: int, lcm_origin_steps: int, device: Union[str, torch.device] = None
self, strength, num_inference_steps: int, lcm_origin_steps: int, device: Union[str, torch.device] = None
):
"""
Sets the discrete timesteps used for the diffusion chain (to be run before inference).
@@ -668,7 +668,7 @@ def set_timesteps(
# LCM Timesteps Setting: # Linear Spacing
c = self.config.num_train_timesteps // lcm_origin_steps
lcm_origin_timesteps = (
np.asarray(list(range(1, int(lcm_origin_steps * stength) + 1))) * c - 1
np.asarray(list(range(1, int(lcm_origin_steps * strength) + 1))) * c - 1
) # LCM Training Steps Schedule
skipping_step = len(lcm_origin_timesteps) // num_inference_steps
timesteps = lcm_origin_timesteps[::-skipping_step][:num_inference_steps] # LCM Inference Steps Schedule
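To make the schedule concrete, a small worked example (values chosen for illustration, not taken from the file):

```python
# num_train_timesteps=1000, lcm_origin_steps=50, strength=0.8, num_inference_steps=4
c = 1000 // 50                                                            # 20
lcm_origin_timesteps = [i * c - 1 for i in range(1, int(50 * 0.8) + 1)]   # [19, 39, ..., 799]
skipping_step = len(lcm_origin_timesteps) // 4                            # 10
timesteps = lcm_origin_timesteps[::-skipping_step][:4]                    # [799, 599, 399, 199]
```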
2 changes: 1 addition & 1 deletion examples/community/magic_mix.py
@@ -129,7 +129,7 @@ def __call__(

input = (
(mix_factor * latents) + (1 - mix_factor) * orig_latents
) # interpolating between layout noise and conditionally generated noise to preserve layout sematics
) # interpolating between layout noise and conditionally generated noise to preserve layout semantics
input = torch.cat([input] * 2)

else: # content generation phase
6 changes: 3 additions & 3 deletions examples/community/mixture_tiling.py
@@ -196,9 +196,9 @@ def __call__(
guidance_scale_tiles: specific weights for classifier-free guidance in each tile.
guidance_scale_tiles: specific weights for classifier-free guidance in each tile. If None, the value provided in guidance_scale will be used.
seed_tiles: specific seeds for the initialization latents in each tile. These will override the latents generated for the whole canvas using the standard seed parameter.
seed_tiles_mode: either "full" "exclusive". If "full", all the latents affected by the tile be overriden. If "exclusive", only the latents that are affected exclusively by this tile (and no other tiles) will be overriden.
seed_reroll_regions: a list of tuples in the form (start row, end row, start column, end column, seed) defining regions in pixel space for which the latents will be overriden using the given seed. Takes priority over seed_tiles.
cpu_vae: the decoder from latent space to pixel space can require too mucho GPU RAM for large images. If you find out of memory errors at the end of the generation process, try setting this parameter to True to run the decoder in CPU. Slower, but should run without memory issues.
seed_tiles_mode: either "full" "exclusive". If "full", all the latents affected by the tile be overridden. If "exclusive", only the latents that are affected exclusively by this tile (and no other tiles) will be overridden.
seed_reroll_regions: a list of tuples in the form (start row, end row, start column, end column, seed) defining regions in pixel space for which the latents will be overridden using the given seed. Takes priority over seed_tiles.
cpu_vae: the decoder from latent space to pixel space can require too much GPU RAM for large images. If you find out of memory errors at the end of the generation process, try setting this parameter to True to run the decoder in CPU. Slower, but should run without memory issues.

Examples:

2 changes: 1 addition & 1 deletion examples/community/pipeline_controlnet_xl_kolors.py
@@ -1258,7 +1258,7 @@ def _cn_patch_forward(*args, **kwargs):
)

if guess_mode and self.do_classifier_free_guidance:
# Infered ControlNet only for the conditional batch.
# Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -1462,7 +1462,7 @@ def _cn_patch_forward(*args, **kwargs):
)

if guess_mode and self.do_classifier_free_guidance:
# Infered ControlNet only for the conditional batch.
# Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]