
adding Context Length Specialization (CCL) #429


Open · quic-vjanfaza wants to merge 5 commits into main

Conversation

quic-vjanfaza

The Context-Length-Specialization technique optimizes the throughput of large language models (LLMs) on Qualcomm devices when handling very large context lengths. Ahead-Of-Time (AOT) compilation on Qualcomm devices cannot predict how many tokens a request will actually need, so the model computes attention over the full compiled context length, which causes significant throughput drops during both the prefill and decode phases. To address this, we introduce Compute Context Length (CCL), an additional ONNX variable that enables dynamic context-length specialization. By generating tokens with smaller, more manageable compute context lengths, we reduce memory reads and attention computation, thereby improving throughput.
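For intuition, here is a minimal NumPy sketch of the idea, not the PR's actual implementation: attention for a new token reads only the first `ccl` positions of the KV cache instead of the full compiled context length. The function name, the single-head layout, and the `ccl`/`cache_len` parameters are illustrative assumptions.

```python
import numpy as np

def attention_with_ccl(q, k_cache, v_cache, cache_len, ccl):
    """Single-head attention that scores only the first `ccl` cache
    positions instead of the full max-context-length cache.

    q:         (1, d)  query for the current token
    k_cache:   (L, d)  key cache allocated for the max context length L
    v_cache:   (L, d)  value cache, same layout
    cache_len: number of valid entries currently in the cache
    ccl:       compute context length; caps how much cache is read
    """
    # Memory reads and score computation are bounded by CCL, not by L.
    window = min(cache_len, ccl)
    k = k_cache[:window]
    v = v_cache[:window]

    # Standard scaled dot-product attention over the reduced window.
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ v

if __name__ == "__main__":
    # Cache allocated for 128K positions, but only 2K tokens decoded so far:
    # each decode step touches at most ccl positions instead of all 131072.
    d, max_ctx = 64, 131072
    rng = np.random.default_rng(0)
    k_cache = rng.standard_normal((max_ctx, d)).astype(np.float32)
    v_cache = rng.standard_normal((max_ctx, d)).astype(np.float32)
    q = rng.standard_normal((1, d)).astype(np.float32)
    out = attention_with_ccl(q, k_cache, v_cache, cache_len=2048, ccl=4096)
    print(out.shape)  # (1, 64)
```

In an AOT-compiled graph the same effect is achieved by exposing the window size as an ONNX variable (the CCL), so the specialized graph can be selected at runtime rather than recompiled.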

@quic-vjanfaza changed the title from "Compute context length" to "adding Context Length Specialization (CCL)" on Jun 3, 2025