New hardware results: RDNA3 7900 XT/XTX, GeForce 4090/4080/4070 and Intel Arc A770 results? #99

Open
oscarbg opened this issue Jan 31, 2023 · 5 comments

oscarbg commented Jan 31, 2023

Hi,

The title says it all.

I'd like to see results for the new NVIDIA 40x0 series, AMD RDNA3 and Intel DG2.

I hope people with the needed hardware can submit them.

Thanks.

al42and commented Mar 13, 2023

My results for the Arc A770 with the Linux 6.2.1 kernel:

Platform: Intel(R) OpenCL HD Graphics
  Device: Intel(R) Graphics [0x56a0]
    Driver version  : 22.49.25018.24 (Linux x64)
    Compute units   : 512
    Clock frequency : 2400 MHz

    Global memory bandwidth (GBPS)
      float   : 397.92
      float2  : 403.43
      float4  : 407.01
      float8  : 417.52
      float16 : 421.01

    Single-precision compute (GFLOPS)
      float   : 13018.01
      float2  : 11137.58
      float4  : 10403.04
      float8  : 10026.99
      float16 : 9701.60

    Half-precision compute (GFLOPS)
      half   : 19552.90
      half2  : 19493.52
      half4  : 19526.21
      half8  : 19459.81
      half16 : 19340.77

    No double precision support! Skipped

    Integer compute (GIOPS)
      int   : 4765.67
      int2  : 4773.43
      int4  : 4789.65
      int8  : 4644.51
      int16 : 5455.67

    Integer compute Fast 24bit (GIOPS)
      int   : 4755.75
      int2  : 4768.87
      int4  : 4786.68
      int8  : 4642.19
      int16 : 5455.34

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 2.64
      enqueueReadBuffer               : 2.43
      enqueueWriteBuffer non-blocking : 2.85
      enqueueReadBuffer non-blocking  : 2.63
      enqueueMapBuffer(for read)      : 2.83
        memcpy from mapped ptr        : 14.38
      enqueueUnmap(after write)       : 2.91
        memcpy to mapped ptr          : 14.01

    Kernel launch latency : 36.30 us

oscarbg commented Mar 16, 2023

@al42and Nice, thanks for sharing. If you also have Windows installed, it would be good to have Windows results too, to check that they don't diverge much.

al42and commented Mar 16, 2023

Don't have Windows :(

retoXD commented Apr 14, 2023

Kernel latency seems worse on Windows.

Platform: Intel(R) OpenCL HD Graphics
  Device: Intel(R) Arc(TM) A770 Graphics
    Driver version  : 31.0.101.4255 (Win64)
    Compute units   : 512
    Clock frequency : 2400 MHz

Global memory bandwidth (GBPS)
  float   : 396.30
  float2  : 403.57
  float4  : 409.15
  float8  : 419.49
  float16 : 423.01

Single-precision compute (GFLOPS)
  float   : 13346.34
  float2  : 11416.61
  float4  : 10663.24
  float8  : 10299.98
  float16 : 9975.71

Half-precision compute (GFLOPS)
  half   : 20033.96
  half2  : 19979.07
  half4  : 19969.53
  half8  : 19922.98
  half16 : 19841.67

No double precision support! Skipped

Integer compute (GIOPS)
  int   : 4830.21
  int2  : 4857.29
  int4  : 4846.14
  int8  : 4724.30
  int16 : 5532.68

Integer compute Fast 24bit (GIOPS)
  int   : 4824.44
  int2  : 4850.69
  int4  : 4829.88
  int8  : 4694.66
  int16 : 5510.71

Transfer bandwidth (GBPS)
  enqueueWriteBuffer              : 11.21
  enqueueReadBuffer               : 5.33
  enqueueWriteBuffer non-blocking : 15.99
  enqueueReadBuffer non-blocking  : 6.21
  enqueueMapBuffer(for read)      : 19.14
    memcpy from mapped ptr        : 19.38
  enqueueUnmap(after write)       : 17.15
    memcpy to mapped ptr          : 19.76

Kernel launch latency : 78.90 us
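
To put a number on the Linux vs. Windows difference, here is a minimal sketch that compares a few figures copied from the two A770 runs above. The variable names are only illustrative, and single runs like these say nothing about run-to-run variance, so treat the ratios as rough indications rather than a proper comparison.

```python
# Figures copied from the two clpeak runs above (Arc A770).
# Latency is in microseconds, transfer bandwidth in GBPS as printed by clpeak.
linux = {"launch_latency_us": 36.30, "enqueueWriteBuffer": 2.64, "enqueueReadBuffer": 2.43}
windows = {"launch_latency_us": 78.90, "enqueueWriteBuffer": 11.21, "enqueueReadBuffer": 5.33}

# Ratio of the Windows figure to the Linux figure for each metric.
ratios = {name: windows[name] / linux[name] for name in linux}

for name, ratio in ratios.items():
    print(f"{name:20s} Windows/Linux = {ratio:.2f}x")
```

For latency a ratio above 1 means Windows is slower; for the transfer figures a ratio above 1 means Windows is faster.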

leuc commented Apr 22, 2023

Kernel 5.17.0-1020-oem and intel-i915-dkms 1.23.3.19.230122.18.5.17.0.1020+i38-1, but transfer bandwidth is capped by PCIe 3.0.

Platform: Intel(R) OpenCL HD Graphics
  Device: Intel(R) Arc(TM) A770 Graphics
    Driver version  : 23.05.25593.18 (Linux x64)
    Compute units   : 512
    Clock frequency : 2400 MHz

    Global memory bandwidth (GBPS)
      float   : 399.42
      float2  : 403.78
      float4  : 408.53
      float8  : 418.51
      float16 : 422.97

    Single-precision compute (GFLOPS)
      float   : 13000.09
      float2  : 11134.71
      float4  : 10402.13
      float8  : 10024.48
      float16 : 9706.12

    Half-precision compute (GFLOPS)
      half   : 19552.26
      half2  : 19500.15
      half4  : 19505.83
      half8  : 19463.29
      half16 : 19341.72

    No double precision support! Skipped

    Integer compute (GIOPS)
      int   : 4311.91
      int2  : 4322.29
      int4  : 4339.57
      int8  : 4212.78
      int16 : 4920.77

    Integer compute Fast 24bit (GIOPS)
      int   : 4307.33
      int2  : 4327.73
      int4  : 4341.63
      int8  : 4203.23
      int16 : 4906.83

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 9.47
      enqueueReadBuffer               : 4.50
      enqueueWriteBuffer non-blocking : 11.07
      enqueueReadBuffer non-blocking  : 4.86
      enqueueMapBuffer(for read)      : 10.10
        memcpy from mapped ptr        : 4.80
      enqueueUnmap(after write)       : 11.38
        memcpy to mapped ptr          : 15.45

    Kernel launch latency : 9.05 us
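
As a rough sanity check on the PCIe 3.0 cap, the sketch below computes the theoretical one-direction link throughput and compares it with the enqueue figures from the run above. It assumes an x16 link, which the post does not state, and it ignores packet and protocol overhead, so real transfers are expected to land somewhat below this ceiling.

```python
# Theoretical PCIe 3.0 throughput in one direction.
GT_PER_S_PER_LANE = 8.0          # PCIe 3.0 raw signalling rate per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding
LANES = 16                       # assumption: x16 link, not stated in the post

# One transfer carries one bit per lane, so divide by 8 to get bytes.
peak_gbps = GT_PER_S_PER_LANE * ENCODING_EFFICIENCY * LANES / 8

# Transfer figures copied from the clpeak run above (GBPS).
measured = {
    "enqueueWriteBuffer non-blocking": 11.07,
    "enqueueReadBuffer non-blocking": 4.86,
    "enqueueUnmap(after write)": 11.38,
}

print(f"PCIe 3.0 x{LANES} theoretical peak: {peak_gbps:.2f} GB/s")
for name, value in measured.items():
    print(f"{name:32s} {value:6.2f} GBPS ({value / peak_gbps:.0%} of peak)")
```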
