Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Bindless][E2E] Enable 3-channel image test for Intel GPUs #16537

Merged

Conversation

przemektmalon
Copy link
Contributor

This patch enables the 3-channel image E2E test on Intel GPUs.

@wenju-he
Copy link
Contributor

wenju-he commented Jan 8, 2025

@przemektmalon could you please update test to use ushort format like in following diff

diff --git a/sycl/test-e2e/bindless_images/3_channel_format.cpp b/sycl/test-e2e/bindless_images/3_channel_format.cpp
index a3668f4f3197..4a081c95ed4a 100644
--- a/sycl/test-e2e/bindless_images/3_channel_format.cpp
+++ b/sycl/test-e2e/bindless_images/3_channel_format.cpp
@@ -21,19 +21,19 @@ int main() {
   auto ctxt = q.get_context();

   constexpr size_t width = 512;
-  std::vector<float> out(width);
-  std::vector<float> expected(width);
-  std::vector<sycl::float3> dataIn(width);
-  float exp = 512;
-  for (int i = 0; i < width; i++) {
+  std::vector<unsigned short> out(width);
+  std::vector<unsigned short> expected(width);
+  std::vector<sycl::ushort3> dataIn(width);
+  unsigned short exp = 512;
+  for (unsigned i = 0; i < width; i++) {
     expected[i] = exp;
-    dataIn[i] = sycl::float3(exp, width, i);
+    dataIn[i] = sycl::ushort3(exp, width, i);
   }

   try {
     // Main point of this test is to check creating an image
     // with a 3-channel format
-    syclexp::image_descriptor desc({width}, 3, sycl::image_channel_type::fp32);
+    syclexp::image_descriptor desc({width}, 3, sycl::image_channel_type::unsigned_int16);

     syclexp::image_mem imgMem(desc, dev, ctxt);

@@ -46,7 +46,7 @@ int main() {
     syclexp::unsampled_image_handle imgHandle =
         sycl::ext::oneapi::experimental::create_image(imgMem, desc, dev, ctxt);

-    sycl::buffer<float> buf(out.data(), width);
+    sycl::buffer<unsigned short> buf(out.data(), width);

     q.submit([&](sycl::handler &cgh) {
       sycl::accessor outAcc{buf, cgh};
@@ -55,9 +55,9 @@ int main() {
 #if defined(__SYCL_DEVICE_ONLY__) && defined(__NVPTX__)
         // This shouldn't be hit anyway since CUDA doesn't support
         // 3-channel formats, but we need to ensure the kernel can compile
-        using pixel_t = sycl::float4;
+        using pixel_t = sycl::ushort4;
 #else
-        using pixel_t = sycl::float3;
+        using pixel_t = sycl::ushort3;
 #endif
         auto pixel = syclexp::fetch_image<pixel_t>(imgHandle, int(id[0]));
         outAcc[id] = pixel[0];

Intel GPU currently only supports ushort and uchar format for 3-channels.

@wenju-he
Copy link
Contributor

wenju-he commented Jan 9, 2025

std::vector<sycl::ushort3> dataIn(width); should be changed to std::vector<sycl::ushort> dataIn(width * 3); because sycl::ushort3 has the same size as sycl::ushort4 and intel gpu NEO runtime assumes memory layout of 3-channel image is of 3-component per element. Thanks @AshwinKumarKulkarni for pointing this out.

@przemektmalon
Copy link
Contributor Author

std::vector<sycl::ushort3> dataIn(width); should be changed to std::vector<sycl::ushort> dataIn(width * 3); because sycl::ushort3 has the same size as sycl::ushort4 and intel gpu NEO runtime assumes memory layout of 3-channel image is of 3-component per element. Thanks @AshwinKumarKulkarni for pointing this out.

Indeed, good catch. Updated to use unsigned short, as sycl::ushort throws compilation errors due to deprecation.

@wenju-he Does this mean the kernel code requires change as well? I.e. when calling syclexp::fetch_image<sycl::ushort3>?

If the Intel driver returns a 6-byte length type, then we might need to use a custom ushort3 struct in the test which satisfies the size requirements, as well as updating the is_data_size_valid() function in bindless_images.hpp here.

@wenju-he
Copy link
Contributor

@wenju-he Does this mean the kernel code requires change as well? I.e. when calling syclexp::fetch_image<sycl::ushort3>?

If the Intel driver returns a 6-byte length type, then we might need to use a custom ushort3 struct in the test which satisfies the size requirements, as well as updating the is_data_size_valid() function in bindless_images.hpp here.

good question, but I don't have a clear answer to it.
Currently there is no correctness issue of using sycl::ushort3 in kernel code because _spirv_ImageRead builtin's return type <3 x i16> has the same alignment, which is 8, as sycl::ushort3 according to LLVM datalayout setting. So it is probably fine to continue use sycl::ushort3 in the kernel code.

@wenju-he
Copy link
Contributor

@intel/bindless-images-reviewers please review, thanks. This test is now passing with latest intel gpu driver.

@przemektmalon
Copy link
Contributor Author

@intel/llvm-gatekeepers I believe this can be merged.

@steffenlarsen steffenlarsen merged commit d70ed19 into intel:sycl Jan 17, 2025
16 checks passed
@steffenlarsen
Copy link
Contributor

@przemektmalon - Test is failing on Arc in post-commit. Could you please address this ASAP or revert?

@ProGTX
Copy link
Contributor

ProGTX commented Jan 17, 2025

@przemektmalon - Test is failing on Arc in post-commit. Could you please address this ASAP or revert?

@wenju-he does the post-commit CI use the latest driver? If not, can we bump it to the latest driver?

@wenju-he
Copy link
Contributor

@ProGTX CI is using latest public gpu driver https://github.com/intel/compute-runtime/releases/tag/24.52.32224.5 in which NEO version is 32224.
32370 the oldest NEO driver that has fix for 3-channel image. So the public gpu driver is older than 32370.
So I think we probably need to wait for public gpu driver uplift. Before that, the test probably fails.

@wenju-he
Copy link
Contributor

wenju-he commented Jan 17, 2025

the bumping is automatic once there is a new public gpu driver. So we have to wait for a new public gpu driver whose version is higher than 32370

@AshwinKumarKulkarni
Copy link
Contributor

It usually takes 3-4 weeks to reach public release/gfx-master. i will try to find the actual schedule of including it in public release and inform here.

@AlexeySachkov
Copy link
Contributor

@ProGTX CI is using latest public gpu driver https://github.com/intel/compute-runtime/releases/tag/24.52.32224.5 in which NEO version is 32224. 32370 the oldest NEO driver that has fix for 3-channel image. So the public gpu driver is older than 32370. So I think we probably need to wait for public gpu driver uplift. Before that, the test probably fails.

We can't have the test failing, but if you would like to preserve the commit, then there is an option to say that the test requires a certain driver version to be present, look for:

// REQUIRES-INTEL-DRIVER: lin: 27501, win: 101.4943

przemektmalon added a commit to codeplaysoftware/intel-llvm-mirror that referenced this pull request Jan 20, 2025
This patch should fix the post-commit failure resulting from enabling
the 3-channel image PR in intel#16537

This is done by adding a `// REQUIRES-INTEL-DRIVER:` comment for LIT to
ignore the test until the necessary driver for the functionality is
introduced to the GitHub CI.
@przemektmalon
Copy link
Contributor Author

We can't have the test failing, but if you would like to preserve the commit, then there is an option to say that the test requires a certain driver version to be present, look for:

// REQUIRES-INTEL-DRIVER: lin: 27501, win: 101.4943

@steffenlarsen @AlexeySachkov
I have created the PR with this update here: #16691

I've used the Linux driver version that @wenju-he mentioned introduces the 3-channel capability. I'm unsure, however, what the corresponding Windows driver version should be.

sarnex pushed a commit that referenced this pull request Jan 21, 2025
This patch should fix the post-commit failure resulting from enabling
the 3-channel image PR in #16537

This is done by adding a `// REQUIRES-INTEL-DRIVER:` comment for LIT to
ignore the test until the necessary driver for the functionality is
introduced to the GitHub CI.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants