Unknown CUDA device compute capability: 8.7 #77

Open
@mstksg

Thanks for the great project :)

I encountered this while running on a Jetson AGX Orin device (sm_87 arch). From a quick look at the source code, it seems like each compute capability needs to be explicitly accounted for?

$ nvidia-device-query
CUDA device query (Driver API, statically linked)
CUDA driver version 11.4
CUDA API version 11.4
Detected 1 CUDA capable device

Device 0: Orin
*** Warning: Unknown CUDA device compute capability: 8.7
*** Please submit a bug report at https://github.com/tmcdonell/cuda/issues

  CUDA capability:                          8.7
  CUDA cores:                               1024 cores in 16 multiprocessors (64 cores/MP)
  Global memory:                            61 GB
  Constant memory:                          64 kB
  Shared memory per block:                  48 kB
  Registers per block:                      65536
  Warp size:                                32
  Maximum threads per multiprocessor:       1536
  Maximum threads per block:                1024
  Maximum grid dimensions:                  2147483647 x 65535 x 65535
  Maximum block dimensions:                 1024 x 1024 x 64
  GPU clock rate:                           1.3 GHz
  Memory clock rate:                        1.3 GHz
  Memory bus width:                         128-bit
  L2 cache size:                            4 MB
  Maximum texture dimensions
    1D:                                     131072
    2D:                                     131072 x 65536
    3D:                                     16384 x 16384 x 16384
  Texture alignment:                        512 B
  Maximum memory pitch:                     2 GB
  Concurrent kernel execution:              Yes
  Concurrent copy and execution:            Yes, with 2 copy engines
  Runtime limit on kernel execution:        No
  Integrated GPU sharing host memory:       Yes
  Host page-locked memory mapping:          Yes
  ECC memory support:                       No
  Unified addressing (UVA):                 Yes
  Single to double precision performance:   32 : 1
  Supports compute pre-emption:             Yes
  Supports cooperative launch:              Yes
  Supports multi-device cooperative launch: Yes
  PCI bus/location:                         0/0
  Compute mode:                             Default
    Multiple contexts are allowed on the device simultaneously
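
To illustrate what "explicitly accounted for" could mean here, below is a minimal Python sketch of a per-capability lookup table that warns and falls back when a capability is missing. It is not the project's actual code: the names `CORES_PER_MP`, `cores_per_mp`, and `total_cores` are hypothetical, the table is deliberately tiny, and the fallback policy (nearest lower listed capability) is an assumption that may differ from what the tool really does.

```python
import warnings

# Hypothetical table: compute capability (major, minor) -> FP32 cores per multiprocessor.
# Only a few illustrative entries; a real table would list every supported capability.
CORES_PER_MP = {
    (7, 5): 64,
    (8, 0): 64,
    (8, 6): 128,
}


def cores_per_mp(major: int, minor: int) -> int:
    """Look up cores/MP, warning and falling back when the capability is not listed."""
    cap = (major, minor)
    if cap in CORES_PER_MP:
        return CORES_PER_MP[cap]

    warnings.warn(f"Unknown CUDA device compute capability: {major}.{minor}")

    # Assumed fallback policy: nearest lower listed capability, else the lowest entry.
    lower = [known for known in CORES_PER_MP if known < cap]
    fallback = max(lower) if lower else min(CORES_PER_MP)
    return CORES_PER_MP[fallback]


def total_cores(major: int, minor: int, multiprocessors: int) -> int:
    """Total cores is the multiprocessor count times cores per multiprocessor."""
    return multiprocessors * cores_per_mp(major, minor)
```

With a table shaped like this, any derived figure (such as the "CUDA cores" line in the output above, which is multiprocessors times cores/MP) depends on whichever entry the fallback picks, so supporting sm_87 would amount to adding an explicit `(8, 7)` entry.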
