Ensure cuda::std::bit_cast works with vector types #3183

Open: miscco wants to merge 2 commits into main from extent_bit_cast_array

miscco
Collaborator

@miscco miscco commented Dec 17, 2024

We want to make sure bit_cast works with suitable vector or aggregate types such as:

  • cuda::std::array
  • C arrays
  • CUDA vector types such as float2

Unfortunately, we cannot use it with the extended floating-point vector types, because they are neither trivially copyable nor trivially copy assignable.
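
For illustration, here is a minimal host-only sketch of the kind of casts this is about. It is not the libcu++ implementation: `sketch_bit_cast` is a hypothetical memcpy-based helper, `float2_like` is a hypothetical stand-in for CUDA's float2 so the snippet builds without the CUDA toolkit, and `std::array` is used in place of `cuda::std::array`.

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for CUDA's float2 (two packed floats).
struct float2_like {
  float x;
  float y;
};

// Hypothetical memcpy-based bit_cast sketch, NOT the libcu++ implementation.
// It requires equal sizes and trivially copyable types, and it needs to
// default-construct the result before copying the bytes into it.
template <class To, class From>
To sketch_bit_cast(const From& from) noexcept {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
  static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
  static_assert(std::is_trivially_default_constructible<To>::value,
                "To must be trivially default constructible");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Vector-like struct -> array of the same element type.
inline std::array<float, 2> to_array(const float2_like& v) noexcept {
  return sketch_bit_cast<std::array<float, 2>>(v);
}

// C array -> vector-like struct. A C array can be the source of a cast,
// but not the destination, because functions cannot return arrays.
inline float2_like from_c_array(const float (&a)[2]) noexcept {
  return sketch_bit_cast<float2_like>(a);
}
```

Calling `to_array` on a `float2_like{1.0f, 2.0f}` yields an array holding the same two floats, and `from_c_array` does the reverse for a `float[2]`.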

@miscco miscco requested review from a team as code owners December 17, 2024 11:27
@miscco miscco force-pushed the extent_bit_cast_array branch from 25dce73 to 0f52938 Compare December 17, 2024 11:52
Contributor

🟨 CI finished in 1h 40m: Pass: 87%/170 | Total: 2d 14h | Avg: 22m 07s | Max: 1h 06m | Hits: 62%/20274
  • 🟨 libcudacxx: Pass: 56%/48 | Total: 13h 19m | Avg: 16m 38s | Max: 1h 03m | Hits: 51%/7578

    🔍 jobs: Build 🔍
      🔍 Build              Pass:  48%/41  | Total: 10h 22m | Avg: 15m 10s | Max: 32m 56s | Hits:  51%/7578  
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 27m | Avg: 21m 55s | Max: 27m 21s
      🟩 Test               Pass: 100%/2   | Total:  1h 27m | Avg: 43m 34s | Max:  1h 03m
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
    🟨 ctk
      🟥 11.1               Pass:   0%/7   | Total:  1h 53m | Avg: 16m 16s | Max: 28m 47s
      🟩 12.5               Pass: 100%/2   | Total:  1h 03m | Avg: 31m 59s | Max: 32m 16s
      🟨 12.6               Pass:  64%/39  | Total: 10h 21m | Avg: 15m 55s | Max:  1h 03m | Hits:  51%/7578  
    🟨 cudacxx
      🟥 ClangCUDA18        Pass:   0%/4   | Total:  1h 04m | Avg: 16m 05s | Max: 20m 57s
      🟥 nvcc11.1           Pass:   0%/7   | Total:  1h 53m | Avg: 16m 16s | Max: 28m 47s
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 03m | Avg: 31m 59s | Max: 32m 16s
      🟨 nvcc12.6           Pass:  71%/35  | Total:  9h 16m | Avg: 15m 54s | Max:  1h 03m | Hits:  51%/7578  
    🟨 cxx
      🟥 Clang9             Pass:   0%/4   | Total:  1h 00m | Avg: 15m 02s | Max: 23m 08s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 42s | Avg:  4m 42s | Max:  4m 42s
      🟩 Clang11            Pass: 100%/1   | Total: 22m 08s | Avg: 22m 08s | Max: 22m 08s
      🟩 Clang12            Pass: 100%/1   | Total:  6m 51s | Avg:  6m 51s | Max:  6m 51s
      🟩 Clang13            Pass: 100%/1   | Total: 10m 29s | Avg: 10m 29s | Max: 10m 29s
      🟩 Clang14            Pass: 100%/1   | Total: 19m 57s | Avg: 19m 57s | Max: 19m 57s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
      🟩 Clang16            Pass: 100%/1   | Total: 21m 36s | Avg: 21m 36s | Max: 21m 36s
      🟩 Clang17            Pass: 100%/1   | Total: 20m 55s | Avg: 20m 55s | Max: 20m 55s
      🟨 Clang18            Pass:  37%/8   | Total:  2h 08m | Avg: 16m 03s | Max: 23m 38s
      🟥 GCC6               Pass:   0%/2   | Total: 30m 21s | Avg: 15m 10s | Max: 22m 22s
      🟥 GCC7               Pass:   0%/2   | Total: 18m 27s | Avg:  9m 13s | Max: 15m 11s
      🟥 GCC8               Pass:   0%/1   | Total: 20m 22s | Avg: 20m 22s | Max: 20m 22s
      🟥 GCC9               Pass:   0%/3   | Total: 31m 38s | Avg: 10m 32s | Max: 18m 07s
      🟥 GCC10              Pass:   0%/1   | Total:  6m 54s | Avg:  6m 54s | Max:  6m 54s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 57s | Avg:  3m 57s | Max:  3m 57s
      🟩 GCC12              Pass: 100%/1   | Total: 22m 32s | Avg: 22m 32s | Max: 22m 32s
      🟨 GCC13              Pass:  90%/10  | Total:  3h 04m | Avg: 18m 26s | Max:  1h 03m
      🟥 Intel2023.2.0      Pass:   0%/1   | Total:  5m 42s | Avg:  5m 42s | Max:  5m 42s
      🟥 MSVC14.16          Pass:   0%/1   | Total: 28m 47s | Avg: 28m 47s | Max: 28m 47s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 32m 56s | Avg: 32m 56s | Max: 32m 56s | Hits:  31%/2477  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 49m 17s | Avg: 24m 38s | Max: 31m 45s | Hits:  60%/5101  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 03m | Avg: 31m 59s | Max: 32m 16s
    🟨 cxx_family
      🟨 Clang              Pass:  55%/20  | Total:  4h 59m | Avg: 14m 59s | Max: 23m 38s
      🟨 GCC                Pass:  52%/21  | Total:  5h 18m | Avg: 15m 10s | Max:  1h 03m
      🟥 Intel              Pass:   0%/1   | Total:  5m 42s | Avg:  5m 42s | Max:  5m 42s
      🟨 MSVC               Pass:  75%/4   | Total:  1h 51m | Avg: 27m 45s | Max: 32m 56s | Hits:  51%/7578  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 59s | Max: 32m 16s
    🟨 gpu
      🟨 v100               Pass:  56%/48  | Total: 13h 19m | Avg: 16m 38s | Max:  1h 03m | Hits:  51%/7578  
    🟨 cpu
      🟨 amd64              Pass:  58%/46  | Total: 12h 48m | Avg: 16m 42s | Max:  1h 03m | Hits:  51%/7578  
      🟥 arm64              Pass:   0%/2   | Total: 30m 58s | Avg: 15m 29s | Max: 20m 23s
    🟨 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/4   | Total:  1h 04m | Avg: 16m 05s | Max: 20m 57s
      🟨 nvcc               Pass:  61%/44  | Total: 12h 14m | Avg: 16m 42s | Max:  1h 03m | Hits:  51%/7578  
    🟨 sm
      🟥 90                 Pass:   0%/1   | Total: 12m 51s | Avg: 12m 51s | Max: 12m 51s
      🟨 90a                Pass:  50%/2   | Total: 16m 46s | Avg:  8m 23s | Max: 12m 42s
    🟨 std
      🟨 11                 Pass:  16%/6   | Total:  1h 37m | Avg: 16m 10s | Max: 23m 08s
      🟨 14                 Pass:  20%/5   | Total:  1h 30m | Avg: 18m 03s | Max: 28m 47s
      🟨 17                 Pass:  53%/13  | Total:  3h 32m | Avg: 16m 19s | Max: 32m 56s | Hits:  31%/4954  
      🟨 20                 Pass:  73%/23  | Total:  6h 37m | Avg: 17m 17s | Max:  1h 03m | Hits:  88%/2624  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 05h | Avg: 37m 09s | Max: 1h 06m | Hits: 73%/3124

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 03h | Avg: 36m 57s | Max:  1h 06m | Hits:  73%/3124  
      🟩 arm64              Pass: 100%/2   | Total:  1h 23m | Avg: 41m 53s | Max: 42m 16s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  4h 10m | Avg: 35m 48s | Max: 54m 17s | Hits:  74%/781   
      🟩 12.5               Pass: 100%/2   | Total:  1h 43m | Avg: 51m 54s | Max: 54m 28s
      🟩 12.6               Pass: 100%/38  | Total: 23h 12m | Avg: 36m 38s | Max:  1h 06m | Hits:  73%/2343  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 56m | Avg: 58m 25s | Max:  1h 06m
      🟩 nvcc11.1           Pass: 100%/7   | Total:  4h 10m | Avg: 35m 48s | Max: 54m 17s | Hits:  74%/781   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 43m | Avg: 51m 54s | Max: 54m 28s
      🟩 nvcc12.6           Pass: 100%/36  | Total: 21h 15m | Avg: 35m 25s | Max: 53m 17s | Hits:  73%/2343  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 25s | Max:  1h 06m
      🟩 nvcc               Pass: 100%/45  | Total:  1d 03h | Avg: 36m 13s | Max: 54m 28s | Hits:  73%/3124  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  2h 28m | Avg: 37m 13s | Max: 41m 49s
      🟩 Clang10            Pass: 100%/1   | Total: 41m 41s | Avg: 41m 41s | Max: 41m 41s
      🟩 Clang11            Pass: 100%/1   | Total: 39m 13s | Avg: 39m 13s | Max: 39m 13s
      🟩 Clang12            Pass: 100%/1   | Total: 38m 13s | Avg: 38m 13s | Max: 38m 13s
      🟩 Clang13            Pass: 100%/1   | Total: 38m 09s | Avg: 38m 09s | Max: 38m 09s
      🟩 Clang14            Pass: 100%/1   | Total: 39m 40s | Avg: 39m 40s | Max: 39m 40s
      🟩 Clang15            Pass: 100%/1   | Total: 37m 54s | Avg: 37m 54s | Max: 37m 54s
      🟩 Clang16            Pass: 100%/1   | Total: 40m 31s | Avg: 40m 31s | Max: 40m 31s
      🟩 Clang17            Pass: 100%/1   | Total: 38m 38s | Avg: 38m 38s | Max: 38m 38s
      🟩 Clang18            Pass: 100%/7   | Total:  4h 37m | Avg: 39m 40s | Max:  1h 06m
      🟩 GCC6               Pass: 100%/2   | Total:  1h 06m | Avg: 33m 04s | Max: 33m 34s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 17m | Avg: 38m 43s | Max: 39m 43s
      🟩 GCC8               Pass: 100%/1   | Total: 39m 15s | Avg: 39m 15s | Max: 39m 15s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 44m | Avg: 34m 52s | Max: 41m 13s
      🟩 GCC10              Pass: 100%/1   | Total: 38m 23s | Avg: 38m 23s | Max: 38m 23s
      🟩 GCC11              Pass: 100%/1   | Total: 40m 16s | Avg: 40m 16s | Max: 40m 16s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 12m | Avg: 24m 00s | Max: 40m 53s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 31m | Avg: 26m 28s | Max: 43m 45s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 42m 18s | Avg: 42m 18s | Max: 42m 18s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 54m 17s | Avg: 54m 17s | Max: 54m 17s | Hits:  74%/781   
      🟩 MSVC14.29          Pass: 100%/1   | Total: 51m 23s | Avg: 51m 23s | Max: 51m 23s | Hits:  74%/781   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 44m | Avg: 52m 11s | Max: 53m 17s | Hits:  72%/1562  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 54s | Max: 54m 28s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 12h 20m | Avg: 38m 58s | Max:  1h 06m
      🟩 GCC                Pass: 100%/21  | Total: 10h 49m | Avg: 30m 56s | Max: 43m 45s
      🟩 Intel              Pass: 100%/1   | Total: 42m 18s | Avg: 42m 18s | Max: 42m 18s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 30m | Avg: 52m 30s | Max: 54m 17s | Hits:  73%/3124  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 43m | Avg: 51m 54s | Max: 54m 28s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 31m 07s | Avg: 15m 33s | Max: 16m 02s
      🟩 v100               Pass: 100%/45  | Total:  1d 04h | Avg: 38m 07s | Max:  1h 06m | Hits:  73%/3124  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 02h | Avg: 40m 21s | Max:  1h 06m | Hits:  73%/3124  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 16m 04s | Avg: 16m 04s | Max: 16m 04s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 12s | Avg: 17m 12s | Max: 17m 12s
      🟩 HostLaunch         Pass: 100%/3   | Total: 54m 24s | Avg: 18m 08s | Max: 19m 18s
      🟩 TestGPU            Pass: 100%/2   | Total: 44m 35s | Avg: 22m 17s | Max: 24m 41s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 31m 07s | Avg: 15m 33s | Max: 16m 02s
      🟩 90a                Pass: 100%/1   | Total: 14m 43s | Avg: 14m 43s | Max: 14m 43s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  3h 01m | Avg: 36m 15s | Max: 41m 49s
      🟩 14                 Pass: 100%/4   | Total:  2h 44m | Avg: 41m 12s | Max: 54m 17s | Hits:  74%/781   
      🟩 17                 Pass: 100%/12  | Total:  8h 25m | Avg: 42m 08s | Max: 51m 23s | Hits:  74%/1562  
      🟩 20                 Pass: 100%/26  | Total: 14h 54m | Avg: 34m 24s | Max:  1h 06m | Hits:  71%/781   
    
  • 🟩 thrust: Pass: 100%/46 | Total: 17h 35m | Avg: 22m 56s | Max: 1h 04m | Hits: 67%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 30m 00s | Avg: 15m 00s | Max: 18m 45s
    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total: 16h 56m | Avg: 23m 06s | Max:  1h 04m | Hits:  67%/9260  
      🟩 arm64              Pass: 100%/2   | Total: 38m 27s | Avg: 19m 13s | Max: 23m 06s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  2h 03m | Avg: 17m 36s | Max: 56m 10s | Hits:  59%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 51s | Max:  1h 01m
      🟩 12.6               Pass: 100%/37  | Total: 13h 34m | Avg: 22m 00s | Max:  1h 04m | Hits:  69%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 26m 10s | Avg: 13m 05s | Max: 13m 27s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  2h 03m | Avg: 17m 36s | Max: 56m 10s | Hits:  59%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 57m | Avg: 58m 51s | Max:  1h 01m
      🟩 nvcc12.6           Pass: 100%/35  | Total: 13h 08m | Avg: 22m 30s | Max:  1h 04m | Hits:  69%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 26m 10s | Avg: 13m 05s | Max: 13m 27s
      🟩 nvcc               Pass: 100%/44  | Total: 17h 08m | Avg: 23m 23s | Max:  1h 04m | Hits:  67%/9260  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 55m 04s | Avg: 13m 46s | Max: 22m 51s
      🟩 Clang10            Pass: 100%/1   | Total: 23m 44s | Avg: 23m 44s | Max: 23m 44s
      🟩 Clang11            Pass: 100%/1   | Total: 17m 00s | Avg: 17m 00s | Max: 17m 00s
      🟩 Clang12            Pass: 100%/1   | Total: 16m 34s | Avg: 16m 34s | Max: 16m 34s
      🟩 Clang13            Pass: 100%/1   | Total: 17m 37s | Avg: 17m 37s | Max: 17m 37s
      🟩 Clang14            Pass: 100%/1   | Total: 22m 55s | Avg: 22m 55s | Max: 22m 55s
      🟩 Clang15            Pass: 100%/1   | Total: 21m 00s | Avg: 21m 00s | Max: 21m 00s
      🟩 Clang16            Pass: 100%/1   | Total: 18m 39s | Avg: 18m 39s | Max: 18m 39s
      🟩 Clang17            Pass: 100%/1   | Total: 25m 39s | Avg: 25m 39s | Max: 25m 39s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 45m | Avg: 15m 02s | Max: 20m 05s
      🟩 GCC6               Pass: 100%/2   | Total: 21m 11s | Avg: 10m 35s | Max: 15m 04s
      🟩 GCC7               Pass: 100%/2   | Total: 26m 20s | Avg: 13m 10s | Max: 15m 24s
      🟩 GCC8               Pass: 100%/1   | Total: 19m 39s | Avg: 19m 39s | Max: 19m 39s
      🟩 GCC9               Pass: 100%/3   | Total: 49m 38s | Avg: 16m 32s | Max: 22m 43s
      🟩 GCC10              Pass: 100%/1   | Total: 21m 34s | Avg: 21m 34s | Max: 21m 34s
      🟩 GCC11              Pass: 100%/1   | Total: 23m 43s | Avg: 23m 43s | Max: 23m 43s
      🟩 GCC12              Pass: 100%/1   | Total: 26m 40s | Avg: 26m 40s | Max: 26m 40s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 10m | Avg: 16m 15s | Max: 25m 03s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 36m 25s | Avg: 36m 25s | Max: 36m 25s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 56m 10s | Avg: 56m 10s | Max: 56m 10s | Hits:  59%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 53m 23s | Avg: 53m 23s | Max: 53m 23s | Hits:  60%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 29m | Avg: 49m 43s | Max:  1h 04m | Hits:  72%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 51s | Max:  1h 01m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  5h 23m | Avg: 17m 01s | Max: 25m 39s
      🟩 GCC                Pass: 100%/19  | Total:  5h 18m | Avg: 16m 46s | Max: 26m 40s
      🟩 Intel              Pass: 100%/1   | Total: 36m 25s | Avg: 36m 25s | Max: 36m 25s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 18m | Avg: 51m 44s | Max:  1h 04m | Hits:  67%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 51s | Max:  1h 01m
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total: 17h 35m | Avg: 22m 56s | Max:  1h 04m | Hits:  67%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 16h 16m | Avg: 24m 24s | Max:  1h 04m | Hits:  59%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 37m 32s | Avg: 12m 30s | Max: 22m 30s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 41m 16s | Avg: 13m 45s | Max: 17m 40s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  8m 54s | Avg:  8m 54s | Max:  8m 54s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 44m 10s | Avg:  8m 50s | Max: 13m 16s
      🟩 14                 Pass: 100%/4   | Total:  1h 49m | Avg: 27m 22s | Max: 56m 10s | Hits:  59%/1852  
      🟩 17                 Pass: 100%/12  | Total:  6h 05m | Avg: 30m 28s | Max:  1h 04m | Hits:  59%/3704  
      🟩 20                 Pass: 100%/23  | Total:  8h 25m | Avg: 21m 59s | Max:  1h 01m | Hits:  79%/3704  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 06m | Avg: 4m 52s | Max: 23m 35s | Hits: 92%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  1h 56m | Avg:  5m 16s | Max: 23m 35s | Hits:  92%/312   
      🟩 arm64              Pass: 100%/4   | Total: 10m 34s | Avg:  2m 38s | Max:  2m 43s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 14m 25s | Avg:  4m 48s | Max:  8m 27s | Hits:  92%/156   
      🟩 12.5               Pass: 100%/2   | Total:  9m 53s | Avg:  4m 56s | Max:  5m 01s
      🟩 12.6               Pass: 100%/21  | Total:  1h 42m | Avg:  4m 52s | Max: 23m 35s | Hits:  92%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 14m 25s | Avg:  4m 48s | Max:  8m 27s | Hits:  92%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  9m 53s | Avg:  4m 56s | Max:  5m 01s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 42m | Avg:  4m 52s | Max: 23m 35s | Hits:  92%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 06m | Avg:  4m 52s | Max: 23m 35s | Hits:  92%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 12s | Avg:  3m 12s | Max:  3m 12s
      🟩 Clang10            Pass: 100%/1   | Total:  3m 23s | Avg:  3m 23s | Max:  3m 23s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 03s | Avg:  3m 03s | Max:  3m 03s
      🟩 Clang12            Pass: 100%/1   | Total:  2m 57s | Avg:  2m 57s | Max:  2m 57s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 15s | Avg:  3m 15s | Max:  3m 15s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 06s | Avg:  3m 06s | Max:  3m 06s
      🟩 Clang18            Pass: 100%/4   | Total: 31m 58s | Avg:  7m 59s | Max: 23m 35s
      🟩 GCC9               Pass: 100%/1   | Total:  2m 46s | Avg:  2m 46s | Max:  2m 46s
      🟩 GCC10              Pass: 100%/1   | Total:  2m 58s | Avg:  2m 58s | Max:  2m 58s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 04s | Avg:  3m 04s | Max:  3m 04s
      🟩 GCC12              Pass: 100%/2   | Total: 19m 42s | Avg:  9m 51s | Max: 16m 37s
      🟩 GCC13              Pass: 100%/4   | Total: 10m 44s | Avg:  2m 41s | Max:  2m 58s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  8m 27s | Avg:  8m 27s | Max:  8m 27s | Hits:  92%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  8m 23s | Avg:  8m 23s | Max:  8m 23s | Hits:  92%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  9m 53s | Avg:  4m 56s | Max:  5m 01s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 00m | Avg:  4m 39s | Max: 23m 35s
      🟩 GCC                Pass: 100%/9   | Total: 39m 14s | Avg:  4m 21s | Max: 16m 37s
      🟩 MSVC               Pass: 100%/2   | Total: 16m 50s | Avg:  8m 25s | Max:  8m 27s | Hits:  92%/312   
      🟩 NVHPC              Pass: 100%/2   | Total:  9m 53s | Avg:  4m 56s | Max:  5m 01s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 06m | Avg:  4m 52s | Max: 23m 35s | Hits:  92%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 26m | Avg:  3m 35s | Max:  8m 27s | Hits:  92%/312   
      🟩 Test               Pass: 100%/2   | Total: 40m 12s | Avg: 20m 06s | Max: 23m 35s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 58s | Avg:  2m 58s | Max:  2m 58s
      🟩 90a                Pass: 100%/1   | Total:  2m 36s | Avg:  2m 36s | Max:  2m 36s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 19m 13s | Avg:  3m 12s | Max:  5m 01s
      🟩 20                 Pass: 100%/20  | Total:  1h 47m | Avg:  5m 22s | Max: 23m 35s | Hits:  92%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 52s | Avg: 4m 56s | Max: 7m 50s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  7m 50s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  7m 50s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  7m 50s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  7m 50s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  7m 50s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  7m 50s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  7m 50s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 02s | Avg:  2m 02s | Max:  2m 02s
      🟩 Test               Pass: 100%/1   | Total:  7m 50s | Avg:  7m 50s | Max:  7m 50s
    
  • 🟩 python: Pass: 100%/1 | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 25m 16s | Avg: 25m 16s | Max: 25m 16s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 170)

# Runner
125 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
15 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@fbusato
Contributor

fbusato commented Dec 17, 2024

Great to see this extension for bit_cast!
AFAIK, CUDA vector types like float2 should already be trivially copyable. Please also consider std::pair.

@miscco
Collaborator Author

miscco commented Dec 17, 2024

> Great to see this extension for bit_cast! AFAIK, CUDA vector types like float2 should already be trivially copyable. Please also consider std::pair.

They are; I was talking about __half2 and __nv_bfloat162.

The issue with std::pair is that it is neither trivially copyable nor trivially default constructible.
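
A small sketch of how these properties can be checked with standard type traits. All names here are hypothetical: `custom_assign` is a stand-in for a type with a user-provided copy assignment operator (it is not the actual definition of __half2 or __nv_bfloat162), and `memcpy_castable` is an illustrative helper, not a libcu++ trait. The standard only requires both types to be trivially copyable; the default-constructibility check reflects what a memcpy-based fallback would additionally need.

```cpp
#include <type_traits>
#include <utility>

// Plain aggregate: trivially copyable and trivially default constructible.
struct plain_pair {
  int first;
  int second;
};

// Hypothetical stand-in for a type with a user-provided copy assignment.
// That single user-provided operator makes the type non-trivially copyable.
struct custom_assign {
  int value;
  custom_assign() = default;
  custom_assign(const custom_assign&) = default;
  custom_assign& operator=(const custom_assign& other) {
    value = other.value;
    return *this;
  }
};

// Hypothetical helper: can T be the result of a memcpy-based bit_cast?
template <class T>
constexpr bool memcpy_castable() {
  return std::is_trivially_copyable<T>::value &&
         std::is_trivially_default_constructible<T>::value;
}

static_assert(memcpy_castable<plain_pair>(), "plain aggregates qualify");
static_assert(!memcpy_castable<custom_assign>(), "user-provided assignment disqualifies");

// The result for std::pair depends on the standard library implementation,
// so it is only reported here rather than asserted.
inline bool std_pair_is_memcpy_castable() {
  return memcpy_castable<std::pair<int, int>>();
}
```

The same traits can be applied to the CUDA types in question when compiling with the toolkit headers available.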

@fbusato
Contributor

fbusato commented Dec 17, 2024

I guess now you can replace the unsafe_bitcast function used in CUB
https://github.com/NVIDIA/cccl/blob/main/cub/cub/thread/thread_reduce.cuh#L179

@bernhardmgruber
Contributor

> I guess now you can replace the unsafe_bitcast function used in CUB https://github.com/NVIDIA/cccl/blob/main/cub/cub/thread/thread_reduce.cuh#L179

Please open an issue for that!

@fbusato
Contributor

fbusato commented Dec 31, 2024

Related issue: #2445

miscco added 2 commits January 6, 2025 09:34
@miscco miscco force-pushed the extent_bit_cast_array branch from 0f52938 to 5e3f9e3 Compare January 7, 2025 16:55
Contributor

github-actions bot commented Jan 7, 2025

🟨 CI finished in 2h 07m: Pass: 87%/170 | Total: 3d 12h | Avg: 29m 57s | Max: 1h 16m | Hits: 55%/20285
  • 🟨 libcudacxx: Pass: 54%/48 | Total: 10h 05m | Avg: 12m 37s | Max: 30m 54s | Hits: 87%/7581

    🟨 ctk
      🟥 11.1               Pass:   0%/7   | Total:  1h 40m | Avg: 14m 18s | Max: 24m 55s
      🟩 12.5               Pass: 100%/2   | Total: 24m 13s | Avg: 12m 06s | Max: 12m 23s
      🟨 12.6               Pass:  61%/39  | Total:  8h 01m | Avg: 12m 20s | Max: 30m 54s | Hits:  87%/7581  
    🟨 cudacxx
      🟥 ClangCUDA18        Pass:   0%/4   | Total:  1h 05m | Avg: 16m 17s | Max: 20m 23s
      🟥 nvcc11.1           Pass:   0%/7   | Total:  1h 40m | Avg: 14m 18s | Max: 24m 55s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 24m 13s | Avg: 12m 06s | Max: 12m 23s
      🟨 nvcc12.6           Pass:  68%/35  | Total:  6h 56m | Avg: 11m 53s | Max: 30m 54s | Hits:  87%/7581  
    🟨 cxx
      🟥 Clang9             Pass:   0%/4   | Total: 46m 31s | Avg: 11m 37s | Max: 18m 36s
      🟩 Clang10            Pass: 100%/1   | Total:  8m 11s | Avg:  8m 11s | Max:  8m 11s
      🟩 Clang11            Pass: 100%/1   | Total:  7m 00s | Avg:  7m 00s | Max:  7m 00s
      🟩 Clang12            Pass: 100%/1   | Total:  6m 59s | Avg:  6m 59s | Max:  6m 59s
      🟩 Clang13            Pass: 100%/1   | Total:  6m 55s | Avg:  6m 55s | Max:  6m 55s
      🟩 Clang14            Pass: 100%/1   | Total:  7m 35s | Avg:  7m 35s | Max:  7m 35s
      🟩 Clang15            Pass: 100%/1   | Total:  7m 30s | Avg:  7m 30s | Max:  7m 30s
      🟩 Clang16            Pass: 100%/1   | Total:  7m 17s | Avg:  7m 17s | Max:  7m 17s
      🟩 Clang17            Pass: 100%/1   | Total:  7m 35s | Avg:  7m 35s | Max:  7m 35s
      🟨 Clang18            Pass:  25%/8   | Total:  1h 53m | Avg: 14m 09s | Max: 25m 23s
      🟥 GCC6               Pass:   0%/2   | Total: 24m 10s | Avg: 12m 05s | Max: 19m 29s
      🟥 GCC7               Pass:   0%/2   | Total: 16m 54s | Avg:  8m 27s | Max: 11m 12s
      🟥 GCC8               Pass:   0%/1   | Total:  6m 20s | Avg:  6m 20s | Max:  6m 20s
      🟥 GCC9               Pass:   0%/3   | Total: 32m 03s | Avg: 10m 41s | Max: 18m 31s
      🟥 GCC10              Pass:   0%/1   | Total:  7m 58s | Avg:  7m 58s | Max:  7m 58s
      🟩 GCC11              Pass: 100%/1   | Total:  7m 06s | Avg:  7m 06s | Max:  7m 06s
      🟩 GCC12              Pass: 100%/1   | Total:  7m 55s | Avg:  7m 55s | Max:  7m 55s
      🟨 GCC13              Pass:  90%/10  | Total:  2h 46m | Avg: 16m 41s | Max: 30m 54s
      🟥 Intel2023.2.0      Pass:   0%/1   | Total:  8m 57s | Avg:  8m 57s | Max:  8m 57s
      🟥 MSVC14.16          Pass:   0%/1   | Total: 24m 55s | Avg: 24m 55s | Max: 24m 55s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 19m 13s | Avg: 19m 13s | Max: 19m 13s | Hits:  87%/2478  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 40m 28s | Avg: 20m 14s | Max: 21m 54s | Hits:  87%/5103  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 24m 13s | Avg: 12m 06s | Max: 12m 23s
    🟨 cxx_family
      🟨 Clang              Pass:  50%/20  | Total:  3h 38m | Avg: 10m 56s | Max: 25m 23s
      🟨 GCC                Pass:  52%/21  | Total:  4h 29m | Avg: 12m 49s | Max: 30m 54s
      🟥 Intel              Pass:   0%/1   | Total:  8m 57s | Avg:  8m 57s | Max:  8m 57s
      🟨 MSVC               Pass:  75%/4   | Total:  1h 24m | Avg: 21m 09s | Max: 24m 55s | Hits:  87%/7581  
      🟩 NVHPC              Pass: 100%/2   | Total: 24m 13s | Avg: 12m 06s | Max: 12m 23s
    🟨 jobs
      🟨 Build              Pass:  48%/41  | Total:  7h 22m | Avg: 10m 47s | Max: 24m 55s | Hits:  87%/7581  
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 45m | Avg: 26m 27s | Max: 30m 54s
      🟨 Test               Pass:  50%/2   | Total: 55m 40s | Avg: 27m 50s | Max: 30m 17s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 01s | Avg:  2m 01s | Max:  2m 01s
    🟨 gpu
      🟨 v100               Pass:  54%/48  | Total: 10h 05m | Avg: 12m 37s | Max: 30m 54s | Hits:  87%/7581  
    🟨 cpu
      🟨 amd64              Pass:  56%/46  | Total:  9h 52m | Avg: 12m 53s | Max: 30m 54s | Hits:  87%/7581  
      🟥 arm64              Pass:   0%/2   | Total: 13m 00s | Avg:  6m 30s | Max:  6m 36s
    🟨 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/4   | Total:  1h 05m | Avg: 16m 17s | Max: 20m 23s
      🟨 nvcc               Pass:  59%/44  | Total:  9h 00m | Avg: 12m 17s | Max: 30m 54s | Hits:  87%/7581  
    🟨 sm
      🟥 90                 Pass:   0%/1   | Total: 12m 23s | Avg: 12m 23s | Max: 12m 23s
      🟨 90a                Pass:  50%/2   | Total: 18m 18s | Avg:  9m 09s | Max: 12m 05s
    🟨 std
      🟨 11                 Pass:  16%/6   | Total:  1h 38m | Avg: 16m 28s | Max: 19m 29s
      🟨 14                 Pass:  20%/5   | Total:  1h 09m | Avg: 13m 50s | Max: 25m 32s
      🟨 17                 Pass:  53%/13  | Total:  2h 41m | Avg: 12m 23s | Max: 30m 54s | Hits:  87%/4956  
      🟨 20                 Pass:  69%/23  | Total:  4h 34m | Avg: 11m 56s | Max: 30m 47s | Hits:  87%/2625  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 16h | Avg: 51m 39s | Max: 1h 12m | Hits: 19%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 14h | Avg: 51m 09s | Max:  1h 12m | Hits:  19%/3132  
      🟩 arm64              Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 05m
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  5h 53m | Avg: 50m 29s | Max: 59m 25s | Hits:  20%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
      🟩 12.6               Pass: 100%/38  | Total:  1d 08h | Avg: 50m 56s | Max:  1h 08m | Hits:  19%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 03m
      🟩 nvcc11.1           Pass: 100%/7   | Total:  5h 53m | Avg: 50m 29s | Max: 59m 25s | Hits:  20%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
      🟩 nvcc12.6           Pass: 100%/36  | Total:  1d 06h | Avg: 50m 20s | Max:  1h 08m | Hits:  19%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 03m
      🟩 nvcc               Pass: 100%/45  | Total:  1d 14h | Avg: 51m 12s | Max:  1h 12m | Hits:  19%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  3h 35m | Avg: 53m 58s | Max: 58m 49s
      🟩 Clang10            Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 Clang11            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 Clang12            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 Clang13            Pass: 100%/1   | Total: 59m 49s | Avg: 59m 49s | Max: 59m 49s
      🟩 Clang14            Pass: 100%/1   | Total: 54m 43s | Avg: 54m 43s | Max: 54m 43s
      🟩 Clang15            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 Clang16            Pass: 100%/1   | Total: 54m 19s | Avg: 54m 19s | Max: 54m 19s
      🟩 Clang17            Pass: 100%/1   | Total: 57m 21s | Avg: 57m 21s | Max: 57m 21s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 42m | Avg: 48m 56s | Max:  1h 05m
      🟩 GCC6               Pass: 100%/2   | Total:  1h 36m | Avg: 48m 16s | Max: 50m 47s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 56m | Avg: 58m 28s | Max: 58m 41s
      🟩 GCC8               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 GCC9               Pass: 100%/3   | Total:  2h 34m | Avg: 51m 20s | Max: 57m 21s
      🟩 GCC10              Pass: 100%/1   | Total: 55m 41s | Avg: 55m 41s | Max: 55m 41s
      🟩 GCC11              Pass: 100%/1   | Total: 55m 02s | Avg: 55m 02s | Max: 55m 02s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 46m | Avg: 35m 32s | Max:  1h 03m
      🟩 GCC13              Pass: 100%/8   | Total:  4h 52m | Avg: 36m 35s | Max:  1h 00m
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 58m 10s | Avg: 58m 10s | Max: 58m 10s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 59m 25s | Avg: 59m 25s | Max: 59m 25s | Hits:  20%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m | Hits:  19%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 07m | Hits:  19%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 17h 10m | Avg: 54m 13s | Max:  1h 05m
      🟩 GCC                Pass: 100%/21  | Total: 15h 37m | Avg: 44m 39s | Max:  1h 03m
      🟩 Intel              Pass: 100%/1   | Total: 58m 10s | Avg: 58m 10s | Max: 58m 10s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 23m | Avg:  1h 05m | Max:  1h 08m | Hits:  19%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 43m 06s | Avg: 21m 33s | Max: 27m 04s
      🟩 v100               Pass: 100%/45  | Total:  1d 15h | Avg: 53m 00s | Max:  1h 12m | Hits:  19%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 14h | Avg: 57m 14s | Max:  1h 12m | Hits:  19%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 19m 00s | Avg: 19m 00s | Max: 19m 00s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 13s | Avg: 16m 13s | Max: 16m 13s
      🟩 HostLaunch         Pass: 100%/3   | Total: 57m 42s | Avg: 19m 14s | Max: 22m 10s
      🟩 TestGPU            Pass: 100%/2   | Total: 45m 49s | Avg: 22m 54s | Max: 24m 17s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 43m 06s | Avg: 21m 33s | Max: 27m 04s
      🟩 90a                Pass: 100%/1   | Total: 29m 05s | Avg: 29m 05s | Max: 29m 05s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  4h 22m | Avg: 52m 33s | Max: 58m 49s
      🟩 14                 Pass: 100%/4   | Total:  3h 44m | Avg: 56m 10s | Max: 59m 25s | Hits:  20%/783   
      🟩 17                 Pass: 100%/12  | Total: 11h 56m | Avg: 59m 43s | Max:  1h 08m | Hits:  19%/1566  
      🟩 20                 Pass: 100%/26  | Total: 20h 24m | Avg: 47m 04s | Max:  1h 12m | Hits:  18%/783   
    
  • 🟩 thrust: Pass: 100%/46 | Total: 1d 07h | Avg: 40m 46s | Max: 1h 16m | Hits: 39%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total:  1h 21m | Avg: 40m 42s | Max: 45m 48s
    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total:  1d 06h | Avg: 40m 56s | Max:  1h 16m | Hits:  39%/9260  
      🟩 arm64              Pass: 100%/2   | Total:  1h 14m | Avg: 37m 22s | Max: 39m 36s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  4h 21m | Avg: 37m 18s | Max:  1h 15m | Hits:  24%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 16m
      🟩 12.6               Pass: 100%/37  | Total:  1d 00h | Avg: 39m 30s | Max:  1h 16m | Hits:  43%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 12m | Avg: 36m 02s | Max: 36m 42s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  4h 21m | Avg: 37m 18s | Max:  1h 15m | Hits:  24%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 16m
      🟩 nvcc12.6           Pass: 100%/35  | Total: 23h 09m | Avg: 39m 42s | Max:  1h 16m | Hits:  43%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 12m | Avg: 36m 02s | Max: 36m 42s
      🟩 nvcc               Pass: 100%/44  | Total:  1d 06h | Avg: 40m 59s | Max:  1h 16m | Hits:  39%/9260  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  2h 12m | Avg: 33m 05s | Max: 39m 32s
      🟩 Clang10            Pass: 100%/1   | Total: 43m 17s | Avg: 43m 17s | Max: 43m 17s
      🟩 Clang11            Pass: 100%/1   | Total: 40m 01s | Avg: 40m 01s | Max: 40m 01s
      🟩 Clang12            Pass: 100%/1   | Total: 36m 27s | Avg: 36m 27s | Max: 36m 27s
      🟩 Clang13            Pass: 100%/1   | Total: 39m 08s | Avg: 39m 08s | Max: 39m 08s
      🟩 Clang14            Pass: 100%/1   | Total: 41m 57s | Avg: 41m 57s | Max: 41m 57s
      🟩 Clang15            Pass: 100%/1   | Total: 41m 10s | Avg: 41m 10s | Max: 41m 10s
      🟩 Clang16            Pass: 100%/1   | Total: 38m 13s | Avg: 38m 13s | Max: 38m 13s
      🟩 Clang17            Pass: 100%/1   | Total: 42m 43s | Avg: 42m 43s | Max: 42m 43s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 33m | Avg: 30m 30s | Max: 43m 20s
      🟩 GCC6               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 25s | Max: 32m 26s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 28s | Max: 36m 31s
      🟩 GCC8               Pass: 100%/1   | Total: 39m 19s | Avg: 39m 19s | Max: 39m 19s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 39m | Avg: 33m 13s | Max: 37m 36s
      🟩 GCC10              Pass: 100%/1   | Total: 40m 46s | Avg: 40m 46s | Max: 40m 46s
      🟩 GCC11              Pass: 100%/1   | Total: 40m 29s | Avg: 40m 29s | Max: 40m 29s
      🟩 GCC12              Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 43m | Avg: 35m 28s | Max: 46m 24s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 50m 51s | Avg: 50m 51s | Max: 50m 51s
      🟩 MSVC14.16          Pass: 100%/1   | Total:  1h 15m | Avg:  1h 15m | Max:  1h 15m | Hits:  24%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 13m | Avg:  1h 13m | Max:  1h 13m | Hits:  24%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 56m | Avg: 58m 54s | Max:  1h 16m | Hits:  49%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 16m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 11h 08m | Avg: 35m 12s | Max: 43m 20s
      🟩 GCC                Pass: 100%/19  | Total: 11h 18m | Avg: 35m 41s | Max: 46m 24s
      🟩 Intel              Pass: 100%/1   | Total: 50m 51s | Avg: 50m 51s | Max: 50m 51s
      🟩 MSVC               Pass: 100%/5   | Total:  5h 25m | Avg:  1h 05m | Max:  1h 16m | Hits:  39%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 16m
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total:  1d 07h | Avg: 40m 46s | Max:  1h 16m | Hits:  39%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 04h | Avg: 43m 10s | Max:  1h 16m | Hits:  24%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 39m 56s | Avg: 13m 18s | Max: 24m 01s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 48m | Avg: 36m 19s | Max: 46m 24s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 29m 29s | Avg: 29m 29s | Max: 29m 29s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  2h 26m | Avg: 29m 14s | Max: 31m 56s
      🟩 14                 Pass: 100%/4   | Total:  3h 03m | Avg: 45m 57s | Max:  1h 15m | Hits:  24%/1852  
      🟩 17                 Pass: 100%/12  | Total:  9h 36m | Avg: 48m 04s | Max:  1h 16m | Hits:  24%/3704  
      🟩 20                 Pass: 100%/23  | Total: 14h 47m | Avg: 38m 35s | Max:  1h 16m | Hits:  62%/3704  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 26m | Avg: 5m 37s | Max: 18m 09s | Hits: 90%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 11m | Avg:  5m 59s | Max: 18m 09s | Hits:  90%/312   
      🟩 arm64              Pass: 100%/4   | Total: 14m 15s | Avg:  3m 33s | Max:  3m 44s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 42s | Avg:  5m 54s | Max: 10m 25s | Hits:  90%/156   
      🟩 12.5               Pass: 100%/2   | Total: 12m 46s | Avg:  6m 23s | Max:  6m 35s
      🟩 12.6               Pass: 100%/21  | Total:  1h 55m | Avg:  5m 30s | Max: 18m 09s | Hits:  90%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 42s | Avg:  5m 54s | Max: 10m 25s | Hits:  90%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 46s | Avg:  6m 23s | Max:  6m 35s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 55m | Avg:  5m 30s | Max: 18m 09s | Hits:  90%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 26m | Avg:  5m 37s | Max: 18m 09s | Hits:  90%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 33s | Avg:  3m 33s | Max:  3m 33s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 27s | Avg:  4m 27s | Max:  4m 27s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 50s | Avg:  3m 50s | Max:  3m 50s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 16s | Avg:  4m 16s | Max:  4m 16s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 16s | Avg:  4m 16s | Max:  4m 16s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 59s | Avg:  3m 59s | Max:  3m 59s
      🟩 Clang18            Pass: 100%/4   | Total: 28m 35s | Avg:  7m 08s | Max: 17m 23s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 13s | Avg:  4m 13s | Max:  4m 13s
      🟩 GCC12              Pass: 100%/2   | Total: 22m 10s | Avg: 11m 05s | Max: 18m 09s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 52s | Avg:  3m 28s | Max:  3m 44s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 25s | Avg: 10m 25s | Max: 10m 25s | Hits:  90%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 57s | Avg:  9m 57s | Max:  9m 57s | Hits:  90%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 46s | Avg:  6m 23s | Max:  6m 35s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 05m | Avg:  5m 00s | Max: 17m 23s
      🟩 GCC                Pass: 100%/9   | Total: 47m 52s | Avg:  5m 19s | Max: 18m 09s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 22s | Avg: 10m 11s | Max: 10m 25s | Hits:  90%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 46s | Avg:  6m 23s | Max:  6m 35s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 26m | Avg:  5m 37s | Max: 18m 09s | Hits:  90%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 50m | Avg:  4m 36s | Max: 10m 25s | Hits:  90%/312   
      🟩 Test               Pass: 100%/2   | Total: 35m 32s | Avg: 17m 46s | Max: 18m 09s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 15s | Avg:  3m 15s | Max:  3m 15s
      🟩 90a                Pass: 100%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 24m 06s | Avg:  4m 01s | Max:  6m 35s
      🟩 20                 Pass: 100%/20  | Total:  2h 02m | Avg:  6m 06s | Max: 18m 09s | Hits:  90%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 24s | Avg: 4m 42s | Max: 7m 10s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 24s | Avg:  4m 42s | Max:  7m 10s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 24s | Avg:  4m 42s | Max:  7m 10s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 24s | Avg:  4m 42s | Max:  7m 10s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 24s | Avg:  4m 42s | Max:  7m 10s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 24s | Avg:  4m 42s | Max:  7m 10s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 24s | Avg:  4m 42s | Max:  7m 10s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 24s | Avg:  4m 42s | Max:  7m 10s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 14s | Avg:  2m 14s | Max:  2m 14s
      🟩 Test               Pass: 100%/1   | Total:  7m 10s | Avg:  7m 10s | Max:  7m 10s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 37s | Avg: 26m 37s | Max: 26m 37s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 170)

# Runner
125 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
15 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

Projects
Status: In Review
3 participants