Conversation

@TomAugspurger
Contributor

No description provided.

@copy-pr-bot
copy-pr-bot bot commented Nov 13, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@TomAugspurger
Contributor Author

/ok to test ab18e77

@TomAugspurger
Contributor Author

/ok to test 46b1d7b

@pentschev
Member

Is there a benefit to having this always turned on? It might be useful to look at #612, since this output can be very verbose and therefore difficult to handle in practice for regular builds.

@TomAugspurger
Contributor Author

/ok to test 105c60a

@TomAugspurger
Contributor Author

/ok to test 17bc5e6

@TomAugspurger
Contributor Author

/ok to test 1fa6250

@nirandaperera added the improvement and non-breaking labels on Nov 17, 2025

@nirandaperera
Contributor

/ok to test

@nirandaperera
Contributor

I tested out the HostBuffer with @TomAugspurger's reproducer.

[image]

In the image, the left side is with HostBuffer and the right side is with std::vector.
Buffer::allocate time is now insignificant. However, cudaMemcpyAsync now takes longer and absorbs almost all of the gain, so the total time for each insert is more or less the same. 😕
Does cudaMemcpyAsync initialize memory before copying?
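
For context on the question above, here is a minimal host-only sketch of the one difference between the two allocation styles that is certain from the C++ standard: `std::vector<T>(n)` value-initializes its elements, while a default-initialized `new T[n]` leaves the bytes unwritten. `RawHostBuffer`, `allocate_zeroed`, `allocate_uninitialized`, and `copy_into` are hypothetical names for illustration. They are not the HostBuffer or Buffer::allocate from this PR, and `std::memcpy` only stands in for the copy step. Whether deferred first writes account for the extra cudaMemcpyAsync time is an assumption that this thread does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical stand-in for an uninitialized host allocation.
// This is NOT the HostBuffer class from this PR.
struct RawHostBuffer {
    explicit RawHostBuffer(std::size_t n)
        : size(n), data(new std::uint8_t[n]) {}  // default-init: bytes are not written
    std::size_t size;
    std::unique_ptr<std::uint8_t[]> data;
};

// std::vector<T>(n) value-initializes its elements, so every byte is
// zero by the time the constructor returns.
inline std::vector<std::uint8_t> allocate_zeroed(std::size_t n) {
    return std::vector<std::uint8_t>(n);
}

// No byte is written here; the first write happens in whatever fills the buffer.
inline RawHostBuffer allocate_uninitialized(std::size_t n) {
    return RawHostBuffer(n);
}

// Plain host-side stand-in for the copy step (the thread uses cudaMemcpyAsync).
inline void copy_into(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, n);
}
```

Timing `allocate_zeroed` plus `copy_into` against `allocate_uninitialized` plus `copy_into` for the same size would show whether the cost simply moves from allocation to the first write on a given machine. That would be one way to test the assumption before attributing the cost to cudaMemcpyAsync itself.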

Labels

improvement, non-breaking

3 participants