Question on Reproducing Speedup Number in the Paper #1

@LeoXinhaoLee

Hi, thank you for releasing the dataset for your inspiring work. I'm trying to reproduce the speedup over torch/torch.compile reported in your paper (Appendix C.2, Table 4, on H100).

I'm using the script run_kernel.py to evaluate all the kernels in your highlighted folder, and I've been able to reproduce most of the results. However, for the mnist_cross_entropy forward kernel, my measured speedup over torch.compile is very different from the paper's (8.96 vs. 24.87). Is this expected, e.g., because the kernel used in the paper for this task differs from the one in the highlighted folder?

Attached is a screenshot of my reproduced results. Thank you very much for your time and help!

[Image: screenshot of reproduced results]
