[QST] get compressed data from device: get always null values #101

Open
andreamartini opened this issue Apr 12, 2024 · 5 comments
Labels
inactive-30d question Further information is requested

Comments

@andreamartini

Hi,
I have been experimenting with the nvCOMP API, using the low-level quick start example ("low_level_quickstart_example.cpp").
I would like to obtain the compressed data generated by the nvcompBatchedLZ4CompressAsync call and then save it to files.
After the nvcompBatchedLZ4CompressAsync call (which returns success), I am able to get host_compressed_bytes for each chunk in this way:

size_t* host_compressed_bytes;
cudaMallocHost(&host_compressed_bytes, sizeof(size_t) * batch_size);
cudaMemcpy(host_compressed_bytes, device_compressed_bytes, sizeof(size_t) * batch_size, cudaMemcpyDeviceToHost);

Now host_compressed_bytes[i] contains the compressed size of chunk i in bytes, and I print the results to the console:

     [COMPRESSION] Chunk [0] compressed data has size : 65794 bytes
     [COMPRESSION] Chunk [1] compressed data has size : 65794 bytes
     [COMPRESSION] Chunk [2] compressed data has size : 65794 bytes
     [COMPRESSION] Chunk [3] compressed data has size : 65794 bytes
     [COMPRESSION] Chunk [4] compressed data has size : 65794 bytes
      ...
     [COMPRESSION] Chunk [14] compressed data has size : 65794 bytes
     [COMPRESSION] Chunk [15] compressed data has size : 17028 bytes

The following snippet is the code I use to copy the compressed data from device to host:

cudaFreeHost(host_compressed_ptrs);
cudaMallocHost(&host_compressed_ptrs, sizeof(size_t) * batch_size);
for (size_t i = 0; i < batch_size; i++)
{
    cudaMallocHost(&host_compressed_ptrs[i], host_compressed_bytes[i]);
    printf(" host_compressed_ptrs[%zu] = %p\n", i, host_compressed_ptrs[i]);
    cudaMemcpy(host_compressed_ptrs[i], &device_compressed_ptrs[i], host_compressed_bytes[i], cudaMemcpyDeviceToHost);
}

with result:
host_compressed_ptrs[0] = 00000002052F4C00
host_compressed_ptrs[1] = 0000000205304E00
host_compressed_ptrs[2] = 0000000205315000
host_compressed_ptrs[3] = 0000000205325200
host_compressed_ptrs[4] = 0000000205335400
...
host_compressed_ptrs[14] = 00000002053D6800
host_compressed_ptrs[15] = 00000002053E6A00

The problem is that I cannot get the compressed data from the device: for each chunk I get a sequence of null values.
In fact, if I try to print the first and last 5 bytes of each chunk, I get:

 for (size_t i = 0; i < batch_size; i++)
    {
        uint8_t leading_values[5] = { 0x00 };
        uint8_t trailing_values[5] = { 0x00 };
        size_t start_ofs = 0;
        size_t end_ofs = 5;
        size_t idx = 0;
        for (size_t j = start_ofs; j < end_ofs; j++)
            leading_values[idx++] = host_compressed_ptrs[i][j];
...
    host_compressed_ptrs[0] = 00 00 00 00 00  . . .  00 00 00 00 00
    host_compressed_ptrs[1] = 00 00 00 00 00  . . .  00 00 00 00 00
    host_compressed_ptrs[2] = 00 00 00 00 00  . . .  00 00 00 00 00
    host_compressed_ptrs[3] = 00 00 00 00 00  . . .  00 00 00 00 00
    host_compressed_ptrs[4] = 00 00 00 00 00  . . .  00 00 00 00 00
    ...
    host_compressed_ptrs[14] = 00 00 00 00 00  . . .  00 00 00 00 00
    host_compressed_ptrs[15] = 00 00 00 00 00  . . .  00 00 00 00 00

Could you please advise what I am doing wrong? Thank you in advance.

NB:
I am using:

  • Windows 11
  • Visual Studio 2019
  • NVIDIA GeForce GTX 1650 (laptop)
  • Driver 552.12 (April 2024)
  • nvCOMP 3.0.6 (I got the same problem with the previous 3.0.3)

See also: https://forums.developer.nvidia.com/t/nvcomp-get-compressed-data-from-device/288791

@andreamartini andreamartini added ? - Needs Triage question Further information is requested labels Apr 12, 2024
@vmetodiev

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.


This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@naveenaero
Collaborator

naveenaero commented Oct 25, 2024

Hi @andreamartini
Is there a cudaStreamSynchronize call after the call to the nvCOMP compression API? Its absence could explain the behavior. Can you provide sample code that reproduces this issue?
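
Separately from synchronization, the copy loop in the question may be worth a second look. It passes &device_compressed_ptrs[i] as the source, which is the address of the i-th slot in the pointer table rather than the address of chunk i's compressed bytes. Also, if host_compressed_ptrs held the device addresses of the output buffers (as it appears to in the quickstart example), freeing and reallocating it discards those addresses. Because the cudaMemcpy return codes are not checked, a failing copy would go unnoticed and leave the host buffers untouched. The sketch below is a host-only analogy, not nvCOMP code: std::memcpy stands in for cudaMemcpy, and ChunkTable, make_chunks, copy_chunk_bytes and copy_table_slot_bytes are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Host-only stand-in for the batched layout: a table of pointers,
// where ptrs[i] is the address of chunk i's bytes. (Hypothetical type.)
struct ChunkTable {
    std::vector<std::vector<std::uint8_t>> storage;  // owns the chunk bytes
    std::vector<const std::uint8_t*> ptrs;           // ptrs[i] -> storage[i].data()
};

// Build `batch` chunks of `chunk_bytes` bytes; chunk i is filled with 0xA0 + i.
ChunkTable make_chunks(std::size_t batch, std::size_t chunk_bytes) {
    ChunkTable t;
    t.storage.resize(batch);
    for (std::size_t i = 0; i < batch; ++i) {
        t.storage[i].assign(chunk_bytes, static_cast<std::uint8_t>(0xA0 + i));
    }
    for (std::size_t i = 0; i < batch; ++i) {
        t.ptrs.push_back(t.storage[i].data());
    }
    return t;
}

// Intended copy: the source is the VALUE stored in the table, ptrs[i].
std::vector<std::uint8_t> copy_chunk_bytes(const ChunkTable& t,
                                           std::size_t i, std::size_t n) {
    assert(n <= t.storage[i].size());
    std::vector<std::uint8_t> out(n);
    std::memcpy(out.data(), t.ptrs[i], n);
    return out;
}

// Pattern from the question: the source is the ADDRESS of the table slot,
// &ptrs[i]. This yields the raw bytes of a pointer, not the chunk data.
// Only sizeof(pointer) bytes are read so the sketch stays within bounds.
std::vector<std::uint8_t> copy_table_slot_bytes(const ChunkTable& t,
                                                std::size_t i) {
    std::vector<std::uint8_t> out(sizeof(const std::uint8_t*));
    std::memcpy(out.data(), &t.ptrs[i], out.size());
    return out;
}
```

In the real program, the equivalent change would presumably be to keep the host-side array of device addresses, call cudaStreamSynchronize on the compression stream, copy from the saved address into a separate host buffer, and check every cudaError_t return value.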
