[Feature] Can we record layer_id for DiT model? #9836
Comments
You can assign me if you think it is reasonable. Open to discussion.
Thanks for the issue!
@yiyixuxu https://github.com/Zefan-Cai/PyramidKV and https://arxiv.org/html/2407.11550v3 are LLM papers that use layerwise features. I believe we can use the same strategies in DiT models as well. It reminds me of quantized models, where different layers can use different strategies. In the current framework we would need to pass layer_id through the pipeline, transformer_block, block, Attention, and attention_processor, which makes it hard to implement even simple modifications based on layer_id.
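For reference, one way to get per-layer behaviour today without threading layer_id through every call is to assign a different attention processor per block. This is only a minimal sketch: it assumes the model exposes `attn_processors` / `set_attn_processor` (as the diffusers transformer models do), that the processor keys contain a `transformer_blocks.<i>.` segment, and the `LayerAwareProcessor` class and the depth cutoff are made up for illustration.

```python
import re

from diffusers.models.attention_processor import AttnProcessor2_0


class LayerAwareProcessor(AttnProcessor2_0):
    """Hypothetical processor that records which layer it serves."""

    def __init__(self, layer_id: int):
        super().__init__()
        self.layer_id = layer_id
        # a real layerwise algorithm (e.g. a PyramidKV-style budget) would
        # branch on self.layer_id inside __call__


pipe = ...  # any DiT-style pipeline, e.g. PixArt or SD3
transformer = pipe.transformer

processors = {}
for name in transformer.attn_processors:
    # parse the block index out of keys like "transformer_blocks.3.attn1.processor"
    match = re.search(r"transformer_blocks\.(\d+)\.", name)
    layer_id = int(match.group(1)) if match else -1
    # deeper blocks get the hypothetical layer-aware processor, the rest keep the default
    processors[name] = LayerAwareProcessor(layer_id) if layer_id >= 20 else AttnProcessor2_0()

transformer.set_attn_processor(processors)
```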
I think recording extra information in models for inference-only purposes is not a good enough reason to support it. This information can easily be made available at the end-user level with some simple code like:

```python
pipe = ...
transformer = pipe.transformer
for index, block in enumerate(transformer.transformer_blocks):
    block.layer_id = index
```

While I agree that it would be super helpful for debugging, this kind of change introduces extra maintenance effort and adds another step for model authors when integrating their research contributions (for every kind of additional information we would like to maintain). Maybe we could provide debugging utils that create wrappers on the models with this kind of information instead? If you feel the above idea is not helpful, or you have additional thoughts on why this would be impactful to have, I would be happy to hear them and help with the relevant changes!
I agree that using a wrapper is a good design approach. I also want to point out something additional: the snippet above only adds information to transformer_blocks, not to the inner attention and MLP components. My thought is that a utility wrapper could automatically add debugging information to all inner blocks at once. Would this approach work for you? That way we can ensure comprehensive debugging information across the entire model hierarchy in a clean and maintainable way.
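A rough sketch of what such a utility could look like (a hypothetical helper, not an existing diffusers API), assuming the transformer keeps its blocks in `transformer_blocks` and the inner modules are the usual `Attention` and `FeedForward` classes:

```python
import torch.nn as nn

from diffusers.models.attention import FeedForward
from diffusers.models.attention_processor import Attention


def annotate_layer_ids(transformer: nn.Module) -> nn.Module:
    """Stamp every block and its inner attention/MLP modules with the block index."""
    for index, block in enumerate(transformer.transformer_blocks):
        block.layer_id = index
        for module in block.modules():
            if isinstance(module, (Attention, FeedForward)):
                module.layer_id = index
    return transformer


# usage: annotate once after loading; any debug hook or attention processor
# can then read `module.layer_id` at inference time
annotate_layer_ids(pipe.transformer)
```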
I like the idea of debugging utils and being able to inject this information if "debug" mode is enabled. cc @DN6
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Is your feature request related to a problem? Please describe.
Some layerwise algorithms may be based on the layer id.
This only needs some simple modifications to Transformer2DModel and its inner modules, such as the attention and batch_norm parts: just pass the layer_id as an extra parameter.
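To illustrate how a layerwise algorithm could consume such an id without changing any library signatures: an attention processor already receives the Attention module as its first argument, so it can read a `layer_id` attribute that was attached externally (for example by the annotation helper sketched earlier in the thread). The attribute name, the depth threshold, and the branch below are assumptions for illustration only.

```python
from diffusers.models.attention_processor import Attention, AttnProcessor2_0


class LayerwiseAttnProcessor(AttnProcessor2_0):
    """Sketch: branch on an externally attached `layer_id` attribute."""

    def __call__(self, attn: Attention, hidden_states, *args, **kwargs):
        layer_id = getattr(attn, "layer_id", None)
        if layer_id is not None and layer_id >= 24:
            # hypothetical layerwise rule: deep layers could reuse cached
            # features, use a cheaper attention variant, lower precision, etc.
            pass
        return super().__call__(attn, hidden_states, *args, **kwargs)
```

Combined with `set_attn_processor` or the annotation helper above, this keeps the per-layer logic entirely in user code rather than in the model definitions.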