Add tuned parameters for Qwen/Qwen2.5-32B #8966

yarongmu-google · 2025-04-11T21:06:25Z

num_q_heads	num_kv_heads	max_num_batched_tokens	max_num_seqs	input_len	output_len	best_block_size
10	2	4096	128	1800	128	(128, 32)
5	1	4096	128	1800	128	(128, 64)
10	2	1024	1024	2000	48	(128, 32)
5	1	1024	1024	2000	48	(128, 32)
10	2	2048	128	1800	128	(128, 32)
5	1	2048	128	1800	128	(128, 32)
10	2	1024	128	1800	128	(128, 32)
5	1	1024	128	1800	128	(128, 32)
10	2	4096	1024	2000	48	(128, 32)
5	1	4096	1024	2000	48	(128, 64)
10	2	2048	1024	2000	48	(128, 32)
5	1	2048	1024	2000	48	(128, 32)
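
For readers who want to use these results programmatically, the table above can be sketched as a lookup keyed on the first four columns. This is only an illustration, not the code added by this PR: the names `TUNED_BLOCK_SIZES` and `get_tuned_block_size` are hypothetical, and the sketch assumes that `input_len` and `output_len` describe the benchmark workload rather than forming part of the lookup key. The PR text does not say what the two elements of `best_block_size` mean, so the tuple is stored as-is.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key layout (an assumption, not taken from the PR diff):
# (num_q_heads, num_kv_heads, max_num_batched_tokens, max_num_seqs)
ShapeKey = Tuple[int, int, int, int]
BlockSize = Tuple[int, int]

# Values copied from the table in the PR description.
TUNED_BLOCK_SIZES: Dict[ShapeKey, BlockSize] = {
    (10, 2, 4096, 128): (128, 32),
    (5, 1, 4096, 128): (128, 64),
    (10, 2, 1024, 1024): (128, 32),
    (5, 1, 1024, 1024): (128, 32),
    (10, 2, 2048, 128): (128, 32),
    (5, 1, 2048, 128): (128, 32),
    (10, 2, 1024, 128): (128, 32),
    (5, 1, 1024, 128): (128, 32),
    (10, 2, 4096, 1024): (128, 32),
    (5, 1, 4096, 1024): (128, 64),
    (10, 2, 2048, 1024): (128, 32),
    (5, 1, 2048, 1024): (128, 32),
}


def get_tuned_block_size(
    num_q_heads: int,
    num_kv_heads: int,
    max_num_batched_tokens: int,
    max_num_seqs: int,
    default: Optional[BlockSize] = None,
) -> Optional[BlockSize]:
    """Return the tuned block size for a shape, or `default` if it was not tuned."""
    key = (num_q_heads, num_kv_heads, max_num_batched_tokens, max_num_seqs)
    return TUNED_BLOCK_SIZES.get(key, default)


if __name__ == "__main__":
    print(get_tuned_block_size(5, 1, 4096, 128))   # (128, 64)
    print(get_tuned_block_size(10, 2, 4096, 128))  # (128, 32)
    print(get_tuned_block_size(8, 8, 4096, 128))   # None (shape not in the table)
```

Returning a caller-supplied default for untuned shapes keeps the lookup from failing on configurations the table does not cover. Only the two 5-head, 1-KV-head rows with `max_num_batched_tokens` of 4096 select (128, 64); every other row selects (128, 32).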

Signed-off-by: Yarong Mu <[email protected]>

yaochengji

LGTM, thanks, Yarong!


yarongmu-google added 3 commits April 11, 2025 14:03

Add tuned shapes for Qwen/Qwen2.5-32B

a8828e3

Signed-off-by: Yarong Mu <[email protected]>

Add tuned shapes for Qwen/Qwen2.5-32B

711c16f

Signed-off-by: Yarong Mu <[email protected]>

Remove comment

788358e

Signed-off-by: Yarong Mu <[email protected]>

yaochengji approved these changes Apr 11, 2025


Fix lint errors

495a4e7

Signed-off-by: Yarong Mu <[email protected]>

yaochengji enabled auto-merge (squash) April 11, 2025 22:06

qihqi approved these changes Apr 11, 2025


yaochengji merged commit 4eea5e1 into pytorch:master Apr 12, 2025
24 checks passed
