
Initial support for Blackwell #747


Open · wants to merge 5 commits into main
Conversation

johnnynunez

10.0: Blackwell B100/B200 (sm_100)
12.0: Blackwell RTX 50 series (sm_120)
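
For context, a minimal sketch of what these targets look like in a PyTorch-style `TORCH_CUDA_ARCH_LIST` build flow; the function name and the default list below are illustrative, not flashinfer's actual build code:

```python
# Illustrative sketch of a PyTorch-style CUDA arch list with the two
# Blackwell targets this PR adds; not flashinfer's actual build code.
import os

def get_cuda_arch_list() -> list[str]:
    # Respect an explicit user override first, as TORCH_CUDA_ARCH_LIST builds do.
    env = os.environ.get("TORCH_CUDA_ARCH_LIST")
    if env:
        return env.replace(";", " ").split()
    # Existing targets plus the new compute capabilities:
    # 10.0 -> B100/B200 (sm_100), 12.0 -> GeForce RTX 50 series (sm_120).
    return ["8.0", "8.9", "9.0", "10.0", "12.0"]

print(" ".join(get_cuda_arch_list()))
```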

@yzh119
Collaborator

yzh119 commented Jan 23, 2025

Hi @johnnynunez , thanks for bringing this up! Could we hold this PR and wait for the official release of torch 2.6 and blackwell software stack?

@johnnynunez
Author

Hi @johnnynunez , thanks for bringing this up! Could we hold this PR and wait for the official release of torch 2.6 and blackwell software stack?

Yeah, for sure! I added codegen for the whole Blackwell family in PyTorch.
Also, you have references here:
NVIDIA/cccl#3493
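
As a sketch of what that codegen enables on the consumer side, runtime dispatch could detect the new compute capabilities through torch's public API; `is_blackwell` is an illustrative helper, not an existing flashinfer function:

```python
# A minimal sketch, assuming a CUDA-enabled torch build; `is_blackwell`
# is an illustrative helper, not an existing flashinfer function.
import torch

def is_blackwell(device: int = 0) -> bool:
    major, _minor = torch.cuda.get_device_capability(device)
    # Compute capability 10.x covers B100/B200; 12.x covers the RTX 50 series.
    return major in (10, 12)

if torch.cuda.is_available():
    print("Blackwell GPU:", is_blackwell())
```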

@johnnynunez
Author

Hi @johnnynunez , thanks for bringing this up! Could we hold this PR and wait for the official release of torch 2.6 and blackwell software stack?

FYI: pytorch/pytorch#145436

@johnnynunez
Author

johnnynunez commented Jan 23, 2025

FYI: https://docs.nvidia.com/cuda/pdf/ptx_isa_8.7.pdf [image]

@yzh119
Collaborator

yzh119 commented Jan 23, 2025

FYI: https://docs.nvidia.com/cuda/pdf/ptx_isa_8.7.pdf [image]

This is huge!

@johnnynunez
Author

@yzh119 can you merge?

@zhyncs
Member

zhyncs commented Jan 25, 2025

@yzh119 can you merge?

@johnnynunez a reminder: #747 (comment)

@johnnynunez
Author

johnnynunez commented Jan 25, 2025

Well, sure... PyTorch 2.6 is coming this week: M6: Release Day (1/29/25).
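
Once that release lands, gating the new arch flags could look something like this sketch; the helper name and the `(2, 6)` threshold encode this thread's expectation, not a confirmed support matrix:

```python
# Sketch only: gate the new arch flags on the torch 2.6 release this
# thread is waiting for. The >= (2, 6) threshold reflects the thread's
# expectation, not a confirmed support matrix.
import torch

def torch_supports_blackwell_codegen() -> bool:
    base = torch.__version__.split("+")[0]  # drop local tags like "+cu126"
    major, minor = (int(x) for x in base.split(".")[:2])
    return (major, minor) >= (2, 6)

print("enable sm_100/sm_120 codegen:", torch_supports_blackwell_codegen())
```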

@ghostplant

Is there a prebuilt that works on B200?

@YavorGIvanov

YavorGIvanov commented Apr 16, 2025

What performance improvement should we expect out of the box on B200 compared to H100 SXM5 for different model sizes (8B, 70B, 400B)? I expected some benefit even at 8B (e.g. 30% at low batch sizes), but I am seeing no benefit with Llama 8B.

Also, is there any planned or in-progress work on flashinfer utilizing B200-specific capabilities (e.g. the Tensor Memory Accelerator)?
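
For anyone trying to reproduce the comparison, a generic attention microbenchmark along these lines (torch SDPA, not flashinfer's own kernels; the shapes are assumptions loosely sized to one Llama-8B attention layer) can be run unchanged on both H100 and B200:

```python
# Generic attention microbenchmark (torch SDPA), not flashinfer's kernels;
# shapes are assumptions loosely sized to one Llama-8B attention layer.
import time
import torch
import torch.nn.functional as F

def bench_sdpa(batch=8, heads=32, seq=2048, head_dim=128, iters=50) -> float:
    q = torch.randn(batch, heads, seq, head_dim, device="cuda", dtype=torch.bfloat16)
    k, v = torch.randn_like(q), torch.randn_like(q)
    for _ in range(5):  # warmup
        F.scaled_dot_product_attention(q, k, v)
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        F.scaled_dot_product_attention(q, k, v)
    torch.cuda.synchronize()
    return (time.perf_counter() - t0) / iters * 1e3  # mean ms per call

if torch.cuda.is_available():
    print(f"SDPA mean latency: {bench_sdpa():.3f} ms")
```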
