Add simple inference FLOP counter to calc_transformer_flops.py #31

Merged: 4 commits merged into main on Feb 20, 2024

Conversation

@haileyschoelkopf (Contributor)

Output without --infer (same as before this PR):

python calc/calc_transformer_flops.py 

Example with Fairseq-MoE 15B: python calc_transformer_flops.py -l 12 -hs 768 --moe -e 512
Example with GPT-3 175B: python calc_transformer_flops.py -l 96 -hs 12288
Namespace(vocab_size=51200, hidden_size=6144, sequence_length=2048, num_layers=44, kv_size_ratio=1.0, moe=False, num_experts=128, expert_interval=2, topk=1, batch_size=1, tokens=300000000000.0, checkpoint_activations=True, infer=False)
Calculating number of FLOPs with training configuration: {'vocab_size': 51200, 'hidden_size': 6144, 'sequence_length': 2048, 'num_layers': 44, 'kv_size_ratio': 1.0, 'moe': False, 'num_experts': 128, 'expert_interval': 2, 'topk': 1, 'batch_size': 1, 'tokens': 300000000000.0, 'checkpoint_activations': True, 'infer': False}

QKV FLOPs: 11.96 ZFLOPs
Attention Matrix FLOPs: 1.33 ZFLOPs
Attention Over Values FLOPs: 1.33 ZFLOPs
Linear Projection FLOPs: 3.99 ZFLOPs
FFN FLOPs: 31.89 ZFLOPs
Embedding FLOPs: 566.23 EFLOPs
Total FLOPs for the Model: 51.06 ZFLOPs

Output with --infer:

python calc/calc_transformer_flops.py --infer

Example with Fairseq-MoE 15B: python calc_transformer_flops.py -l 12 -hs 768 --moe -e 512
Example with GPT-3 175B: python calc_transformer_flops.py -l 96 -hs 12288
Namespace(vocab_size=51200, hidden_size=6144, sequence_length=2048, num_layers=44, kv_size_ratio=1.0, moe=False, num_experts=128, expert_interval=2, topk=1, batch_size=1, tokens=300000000000.0, checkpoint_activations=True, infer=True)
Calculating number of FLOPs with training configuration: {'vocab_size': 51200, 'hidden_size': 6144, 'sequence_length': 2048, 'num_layers': 44, 'kv_size_ratio': 1.0, 'moe': False, 'num_experts': 128, 'expert_interval': 2, 'topk': 1, 'batch_size': 1, 'tokens': 300000000000.0, 'checkpoint_activations': True, 'infer': True}

QKV FLOPs: 2.99 ZFLOPs
Attention Matrix FLOPs: 332.19 EFLOPs
Attention Over Values FLOPs: 332.19 EFLOPs
Linear Projection FLOPs: 996.57 EFLOPs
FFN FLOPs: 7.97 ZFLOPs
Embedding FLOPs: 566.23 EFLOPs
Total FLOPs for the Model: 13.19 ZFLOPs

Inference cuts FLOP counts to 1/3 (if no activation checkpointing) or 1/4 (with activation checkpointing) of the training count, as expected.

This is a very naive way of calculating "true" inference costs, but it may still be useful, especially if one only wants to run a forward pass, e.g. to get perplexity / log-likelihoods / embeddings on a dataset of X tokens.
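
As a sanity check on the numbers above (an editorial sketch, not part of the PR), the QKV line items follow from standard FLOP counting at 2 FLOPs per multiply-accumulate, with training under activation checkpointing counted as 4x a forward pass (forward + recompute + 2x backward) and inference as the forward pass alone:

    # Sketch reproducing the QKV FLOP lines from the printed config above.
    # Convention: 2 FLOPs per multiply-accumulate (MAC).
    hidden_size, num_layers, tokens = 6144, 44, 300e9

    # QKV projections: three hidden x hidden matmuls per layer per token.
    qkv_forward = 2 * 3 * hidden_size**2 * num_layers * tokens
    print(f"inference (forward only): {qkv_forward / 1e21:.2f} ZFLOPs")      # ~2.99
    # Training with activation checkpointing: forward + recompute + 2x backward.
    print(f"training (w/ act. ckpt): {4 * qkv_forward / 1e21:.2f} ZFLOPs")   # ~11.96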

@haileyschoelkopf linked an issue on Feb 19, 2024 that may be closed by this pull request
@Quentin-Anthony (Member)

This doesn't account for kv-caching (see https://kipp.ly/transformer-inference-arithmetic/), yes? We should either add a comment to that effect, or add kv-caching flops.

@haileyschoelkopf (Contributor, Author)

Yep, this is prefill-only with no generated tokens for now; it should match what's used for calculations by https://arxiv.org/pdf/2401.00448.pdf and https://arxiv.org/abs/2304.03208 for inference-FLOP-budget-adjusted scaling laws.

Happy to add KV-caching FLOPs if there's a good UX for it! Not sure whether it would clutter this mostly-training script (we'd need to specify a prefill sequence length and a total number of generated tokens too; if there are no objections to that extra generated-tokens arg, adding it isn't too bad!). Would it make sense to start thinking about expanding to other useful inference napkin-math scripts / helpful tooling in that scenario?
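
For reference (an editorial sketch, not code from this PR), napkin math for decode FLOPs with a KV cache, in the spirit of kipp.ly's transformer-inference-arithmetic, could look like the following; the function and argument names are hypothetical:

    def decode_flops(hidden_size, num_layers, vocab_size, prefill_len, gen_tokens):
        """Rough decode-phase FLOPs for a standard GPT-style transformer.
        Prefill FLOPs for the prompt are not included; they would be counted
        separately, as in this PR's --infer path."""
        h, L = hidden_size, num_layers
        # Weight matmuls per token per layer (2 FLOPs per MAC):
        # QKV (3 h^2) + output projection (h^2) + FFN with 4x expansion (8 h^2)
        # = 12 h^2 params -> 24 h^2 FLOPs, plus the final logit projection.
        weight_flops_per_token = 24 * h**2 * L + 2 * vocab_size * h
        total = 0.0
        for t in range(gen_tokens):
            kv_len = prefill_len + t  # cached context grows by one token per step
            # Attention scores (2*kv_len*h) + attention over values (2*kv_len*h), per layer.
            total += weight_flops_per_token + 4 * kv_len * h * L
        return total

    # e.g. decode_flops(6144, 44, 51200, prefill_len=2048, gen_tokens=128)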

@Quentin-Anthony (Member)

I think this comment clears it up sufficiently. Thanks!

@Quentin-Anthony merged commit 56aeee1 into main on Feb 20, 2024
1 check passed
@stas00 (Collaborator) commented Feb 20, 2024

Have you tried the new torch.utils.flop_counter, which does it automatically and should do the right thing? https://gist.github.com/Chillee/07b36672a0ca2d1280e42b8d10f23174
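
For anyone trying it, a minimal usage sketch (assuming a recent PyTorch, 2.1 or later; the toy model here is illustrative, not taken from the gist):

    import torch
    from torch.utils.flop_counter import FlopCounterMode

    model = torch.nn.Transformer(d_model=512, num_encoder_layers=2, num_decoder_layers=2)
    src = torch.randn(10, 1, 512)  # (seq, batch, d_model)
    tgt = torch.randn(10, 1, 512)

    # Counts FLOPs of whatever runs inside the context; display=True prints
    # a per-module breakdown on exit.
    with FlopCounterMode(display=True) as flop_counter:
        model(src, tgt)
    print(f"total forward FLOPs: {flop_counter.get_total_flops():,}")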

@haileyschoelkopf (Contributor, Author)

Haven't tried it, but I've been meaning to! This might make sense if #1 were addressed, although I think we wouldn't want to require a forward pass to compute FLOPs for very large models.

@stas00 (Collaborator) commented Feb 20, 2024

Right, from that perspective, yes, the estimate is better.

But the estimated FLOPs could be quite off when torch.compile or fusion is used, no?

Successfully merging this pull request may close these issues: Inference FLOPs