[infra] Run parameterized ONNX model tests across CPU, Vulkan, and HIP. (#19524)

This switches from running ONNX model compile->run correctness tests on only CPU to now run on GPU using the Vulkan and HIP APIs. We could also run on CUDA with #18814 and Metal with #18817.

These new tests will help guard against regressions to full models, at least when using default flags. I'm planning on adding models coming from other frameworks (such as [LiteRT Models](https://github.com/iree-org/iree-test-suites/tree/main/litert_models)) in future PRs.

As these tests will run on every pull request and commit, I'm starting the test list with all tests that are passing on our current set of runners, with no (strict _or_ loose) XFAILs. The full set of tests will be run nightly in https://github.com/iree-org/iree-test-suites using nightly IREE releases... once we have runners with GPUs available in that repository.

See also iree-org/iree-test-suites#65 and iree-org/iree-test-suites#6.

## Sample logs

I have not done much triage on the test failures, but it does seem like Vulkan pass rates are substantially lower than CPU and ROCm. Test reports, including logs for all failures, are currently published as artifacts on actions runs in iree-test-suites, such as https://github.com/iree-org/iree-test-suites/actions/runs/12794322266. We could also archive test reports somewhere like https://github.com/nod-ai/e2eshark-reports and/or host the test reports on a website like https://nod-ai.github.io/shark-ai/llm/sglang/index.html?sort=result.

### CPU

https://github.com/iree-org/iree/actions/runs/12797886622/job/35681117085?pr=19524#step:8:395

```
============================== slowest durations ===============================
39.46s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[vgg/model/vgg19-7.onnx]
13.39s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[caffenet/model/caffenet-12.onnx]
13.25s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[yolov2-coco/model/yolov2-coco-9.onnx]
12.48s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[rcnn_ilsvrc13/model/rcnn-ilsvrc13-9.onnx]
11.93s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[alexnet/model/bvlcalexnet-12.onnx]
11.49s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v1-12.onnx]
11.28s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[densenet-121/model/densenet-12.onnx]
11.26s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v2-7.onnx]
9.14s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[inception_and_googlenet/inception_v2/model/inception-v2-9.onnx]
7.73s call tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/age_googlenet.onnx]
7.61s call tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/gender_googlenet.onnx]
7.57s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[efficientnet-lite4/model/efficientnet-lite4-11.onnx]
7.27s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[tiny-yolov2/model/tinyyolov2-8.onnx]
4.86s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[mobilenet/model/mobilenetv2-12.onnx]
4.61s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-v2-12.onnx]
4.58s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-9.onnx]
3.08s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[squeezenet/model/squeezenet1.0-9.onnx]
2.02s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[mnist/model/mnist-12.onnx]
1.90s call tests/model_zoo/validated/vision/super_resolution_models_test.py::test_models[sub_pixel_cnn_2016/model/super-resolution-10.onnx]
================== 19 passed, 18 skipped in 184.96s (0:03:04) ==================
```

### ROCm

https://github.com/iree-org/iree/actions/runs/12797886622/job/35681117629?pr=19524#step:8:344

```
============================== slowest durations ===============================
9.40s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[densenet-121/model/densenet-12.onnx]
9.15s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[caffenet/model/caffenet-12.onnx]
9.05s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[yolov2-coco/model/yolov2-coco-9.onnx]
8.73s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[rcnn_ilsvrc13/model/rcnn-ilsvrc13-9.onnx]
7.95s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[inception_and_googlenet/inception_v2/model/inception-v2-9.onnx]
7.94s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v1-12.onnx]
7.81s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[alexnet/model/bvlcalexnet-12.onnx]
7.13s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v2-7.onnx]
6.95s call tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/age_googlenet.onnx]
5.15s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[efficientnet-lite4/model/efficientnet-lite4-11.onnx]
4.52s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[inception_and_googlenet/googlenet/model/googlenet-12.onnx]
3.55s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[tiny-yolov2/model/tinyyolov2-8.onnx]
3.12s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-v2-12.onnx]
2.57s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[mobilenet/model/mobilenetv2-12.onnx]
2.48s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-9.onnx]
2.21s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[ssd-mobilenetv1/model/ssd_mobilenet_v1_12.onnx]
1.36s call tests/model_zoo/validated/vision/super_resolution_models_test.py::test_models[sub_pixel_cnn_2016/model/super-resolution-10.onnx]
0.95s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[mnist/model/mnist-12.onnx]
============ 17 passed, 19 skipped, 1 xfailed in 100.10s (0:01:40) =============
```

### Vulkan

https://github.com/iree-org/iree/actions/runs/12797886622/job/35681118044?pr=19524#step:8:216

```
============================== slowest durations ===============================
13.10s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[alexnet/model/bvlcalexnet-12.onnx]
12.97s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[caffenet/model/caffenet-12.onnx]
12.40s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[rcnn_ilsvrc13/model/rcnn-ilsvrc13-9.onnx]
12.22s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[yolov2-coco/model/yolov2-coco-9.onnx]
9.07s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v1-12.onnx]
8.09s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v2-7.onnx]
6.04s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[tiny-yolov2/model/tinyyolov2-8.onnx]
2.93s call tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[ssd-mobilenetv1/model/ssd_mobilenet_v1_12.onnx]
1.86s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[mobilenet/model/mobilenetv2-12.onnx]
0.90s call tests/model_zoo/validated/vision/classification_models_test.py::test_models[mnist/model/mnist-12.onnx]
============= 9 passed, 27 skipped, 1 xfailed in 79.62s (0:01:19) ==============
```

ci-exactly: build_packages, test_onnx
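
To compare the three sample runs at a glance, the pytest summary lines can be tallied with a few lines of Python. This is only a rough sketch for reading the logs, not part of the workflow: `parse_summary` and `passed_fraction` are hypothetical helper names, and the fraction reflects how many collected tests are enabled and passing in each curated config, not the pass rate of the full nightly suite.

```python
import re

# Summary lines copied from the sample logs in this commit message.
SUMMARIES = {
    "CPU": "19 passed, 18 skipped in 184.96s (0:03:04)",
    "ROCm": "17 passed, 19 skipped, 1 xfailed in 100.10s (0:01:40)",
    "Vulkan": "9 passed, 27 skipped, 1 xfailed in 79.62s (0:01:19)",
}

# Matches "<count> <outcome>" pairs such as "19 passed" or "1 xfailed".
OUTCOME_PATTERN = re.compile(r"(\d+) (passed|failed|skipped|xfailed|xpassed)")


def parse_summary(line: str) -> dict[str, int]:
    """Return a mapping from outcome name to count for one summary line."""
    return {outcome: int(count) for count, outcome in OUTCOME_PATTERN.findall(line)}


def passed_fraction(counts: dict[str, int]) -> float:
    """Fraction of all collected tests that passed (skips count against it)."""
    total = sum(counts.values())
    return counts.get("passed", 0) / total if total else 0.0


for name, line in SUMMARIES.items():
    counts = parse_summary(line)
    print(f"{name}: {counts} ({passed_fraction(counts):.1%} of collected tests passed)")
```

Each run collects the same 37 tests, so the differences between backends come entirely from how many entries each config file marks as `"pass"`.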
iree-org · Jan 31, 2025 · 10e66bc
1 parent 36e7593
commit 10e66bc
Showing 7 changed files with 171 additions and 19 deletions.
diff --git a/.github/workflows/pkgci_test_onnx.yml b/.github/workflows/pkgci_test_onnx.yml
@@ -90,7 +90,7 @@ jobs:
         uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
         with:
           repository: iree-org/iree-test-suites
-          ref: 8e6af9e75d874ef8c9f8ff55f12cb38157dd55eb
+          ref: 56509666d185cdb99aafb4ed4d8532b8f3c54fa3
           path: iree-test-suites
       - name: Install ONNX ops test suite requirements
         run: |
@@ -130,11 +130,26 @@ jobs:
         include:
           # CPU
           - name: cpu_llvm_task
-            runs-on: ubuntu-24.04
+            config-file: onnx_models_cpu_llvm_task.json
+            runs-on:
+              - self-hosted # must come first
+              - persistent-cache
+              - Linux
+              - X64
 
-          # TODO(scotttodd): test other backends (parameterize the test suite)
+          # AMD GPU
+          - name: amdgpu_rocm_rdna3
+            config-file: onnx_models_gpu_rocm_rdna3.json
+            runs-on: nodai-amdgpu-w7900-x86-64
+          - name: amdgpu_vulkan
+            config-file: onnx_models_gpu_vulkan.json
+            runs-on: nodai-amdgpu-w7900-x86-64
+
+          # NVIDIA GPU
+          # TODO(#18238): migrate to new runner cluster
     env:
       VENV_DIR: ${{ github.workspace }}/venv
+      CONFIG_FILE_PATH: tests/external/iree-test-suites/onnx_models/${{ matrix.config-file }}
     steps:
       - name: Checking out IREE repository
         uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
@@ -158,7 +173,7 @@ jobs:
         uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
         with:
           repository: iree-org/iree-test-suites
-          ref: 8e6af9e75d874ef8c9f8ff55f12cb38157dd55eb
+          ref: 56509666d185cdb99aafb4ed4d8532b8f3c54fa3
           path: iree-test-suites
       - name: Install ONNX models test suite requirements
         run: |
@@ -170,6 +185,6 @@ jobs:
           pytest iree-test-suites/onnx_models/ \
               -rA \
               --log-cli-level=info \
-              --override-ini=xfail_strict=false \
               --timeout=120 \
-              --durations=0
+              --durations=0 \
+              --test-config-file=${CONFIG_FILE_PATH}
diff --git a/.github/workflows/pkgci_test_sharktank.yml b/.github/workflows/pkgci_test_sharktank.yml
@@ -60,7 +60,7 @@ jobs:
         uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
         with:
           repository: iree-org/iree-test-suites
-          ref: c47d13c0730a51beed3bef6128e7a61a80f85ce9
+          ref: 56509666d185cdb99aafb4ed4d8532b8f3c54fa3
           path: iree-test-suites
           lfs: true
       - name: Install Sharktank models test suite requirements

diff --git a/docs/website/docs/assets/stylesheets/iree.css b/docs/website/docs/assets/stylesheets/iree.css
@@ -17,6 +17,10 @@
   top: 5px;
   left: 0;
   display: inline;
+
+  /* Override other style settings based on header level */
+  font-size: .75rem;
+  text-transform: none;
 }
 
 pre.highlight code {
@@ -72,3 +76,8 @@ pre.highlight code {
   -webkit-mask-image: var(--md-admonition-icon--danger);
   mask-image: var(--md-admonition-icon--danger);
 }
+
+/* Don't convert h5 text to uppercase  */
+.md-typeset h5 {
+  text-transform: none;
+}
diff --git a/docs/website/docs/developers/general/testing-guide.md b/docs/website/docs/developers/general/testing-guide.md
@@ -396,6 +396,8 @@ not supported by Bazel rules at this point.
 
 ## External test suites
 
+### iree-test-suites
+
 Multiple test suites are under development in the
 [iree-org/iree-test-suites](https://github.com/iree-org/iree-test-suites)
 repository.
@@ -407,12 +409,7 @@ repository.
 * Keeping tests out of tree forces them to use public project APIs and allows
   the core project to keep its infrastructure simpler.
 
-The [nod-ai/SHARK-TestSuite](https://github.com/nod-ai/SHARK-TestSuite)
-repository also contains tests for many machine learning models. Some of these
-tests are planned to be migrated into
-[iree-org/iree-test-suites](https://github.com/iree-org/iree-test-suites).
-
-### linalg operator tests
+#### linalg operator tests
 
 Tests for operators in the MLIR linalg dialect like `matmul`, and `convolution`
 are being migrated from folders like
@@ -424,7 +421,7 @@ in the
 [iree-org/iree-test-suites](https://github.com/iree-org/iree-test-suites)
 repository.
 
-### ONNX operator tests
+#### :simple-onnx: ONNX operator tests
 
 Tests for individual ONNX operators are included at
 [`onnx_ops/`](https://github.com/iree-org/iree-test-suites/tree/main/onnx_ops)
@@ -438,7 +435,7 @@ Testing ONNX programs follows several stages:
 
 ```mermaid
 graph LR
-  Import -. "<br>(offline)" .-> Compile
+  Import -. "(offline)" .-> Compile
   Compile --> Run
 ```
 
@@ -469,15 +466,15 @@ To run slices of the test suite, a [pytest](https://docs.pytest.org/en/stable/)
 runner is included that can be configured using JSON files. The JSON files
 tested in the IREE repo itself are stored in
 [`tests/external/iree-test-suites/onnx_ops/`](https://github.com/iree-org/iree/tree/main/tests/external/iree-test-suites/onnx_ops).
-
-For example, here is part of a config file for running ONNX tests on CPU:
+For example, here is part of a config file for running ONNX operator tests on
+CPU:
 
 <!-- markdownlint-disable-next-line -->
 ```json title="tests/external/iree-test-suites/onnx_ops/onnx_ops_cpu_llvm_sync.json" linenums="1"
 --8<-- "tests/external/iree-test-suites/onnx_ops/onnx_ops_cpu_llvm_sync.json::20"
 ```
 
-#### Updating config files
+##### Updating config files
 
 If the ONNX operator tests fail on a GitHub Actions workflow, check the logs
 for the nature of the failure. Often, a test is *newly passing*, with logs
@@ -496,11 +493,59 @@ committed:
 
 ![image](https://github.com/user-attachments/assets/b5dbdcb4-4c0a-4ff2-adc6-9021614179b2)
 
-### ONNX model tests
+#### :simple-onnx: ONNX model tests
 
 Tests for ONNX models are included at
 [`onnx_models/`](https://github.com/iree-org/iree-test-suites/tree/main/onnx_models)
 in the
 [iree-org/iree-test-suites](https://github.com/iree-org/iree-test-suites)
 repository. These tests use models from the upstream
 [onnx/models](https://github.com/onnx/models) repository.
+
+Like the ONNX operator tests, the ONNX model tests use configuration files to
+control which flags are used and which tests are run. The config files tested
+in the IREE repo itself are stored in
+[`tests/external/iree-test-suites/onnx_models/`](https://github.com/iree-org/iree/tree/main/tests/external/iree-test-suites/onnx_models).
+For example, here is part of a config file for running ONNX model tests on CPU:
+
+<!-- markdownlint-disable-next-line -->
+```json title="tests/external/iree-test-suites/onnx_models/onnx_models_cpu_llvm_task.json" linenums="1"
+--8<-- "tests/external/iree-test-suites/onnx_models/onnx_models_cpu_llvm_task.json::14"
+```
+
+Unlike the ONNX operator tests, we do not run the full set of tests on every
+commit to [iree-org/iree](https://github.com/iree-org/iree). Instead, we run a
+curated list of small tests that are expected to pass in
+[iree-org/iree](https://github.com/iree-org/iree) and then run the full set of
+tests nightly in
+[iree-org/iree-test-suites](https://github.com/iree-org/iree-test-suites).
+
+#### sharktank tests
+
+Tests for small scale versions of Large Language Models (LLMs)
+and other Generative AI (GenAI) programs exported using the
+[sharktank package](https://github.com/nod-ai/shark-ai/tree/main/sharktank)
+built as part of the [shark-ai project](https://github.com/nod-ai/shark-ai) are
+included at
+[`sharktank_models/`](https://github.com/iree-org/iree-test-suites/tree/main/sharktank_models)
+in the
+[iree-org/iree-test-suites](https://github.com/iree-org/iree-test-suites)
+repository.
+
+<!-- TODO(scotttodd): document how to coordinate changes across these projects -->
+
+### SHARK-TestSuite
+
+The [nod-ai/SHARK-TestSuite](https://github.com/nod-ai/SHARK-TestSuite)
+repository also contains tests using IREE,
+[llvm/torch-mlir](https://github.com/llvm/torch-mlir), and
+[nod-ai/shark-ai](https://github.com/nod-ai/shark-ai).
+
+Some test coverage may overlap between SHARK-TestSuite and iree-test-suites,
+though some tests are planned to be migrated into
+[iree-org/iree-test-suites](https://github.com/iree-org/iree-test-suites) once
+they mature and have demonstrated general utility to the upstream developer
+community.
+
+Test reports for nightly runs in SHARK-TestSuite are uploaded to
+[nod-ai/e2eshark-reports](https://github.com/nod-ai/e2eshark-reports).
diff --git a/tests/external/iree-test-suites/onnx_models/onnx_models_cpu_llvm_task.json b/tests/external/iree-test-suites/onnx_models/onnx_models_cpu_llvm_task.json
@@ -0,0 +1,32 @@
+{
+  "config_name": "cpu_llvm_task",
+  "iree_compile_flags": [
+    "--iree-hal-target-backends=llvm-cpu",
+    "--iree-llvmcpu-target-cpu=host"
+  ],
+  "iree_run_module_flags": [
+    "--device=local-task"
+  ],
+  "tests_and_expected_outcomes": {
+    "default": "skip",
+    "tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/age_googlenet.onnx]": "pass",
+    "tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/gender_googlenet.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[alexnet/model/bvlcalexnet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[caffenet/model/caffenet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[densenet-121/model/densenet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[efficientnet-lite4/model/efficientnet-lite4-11.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[inception_and_googlenet/inception_v2/model/inception-v2-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[mnist/model/mnist-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[mobilenet/model/mobilenetv2-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[rcnn_ilsvrc13/model/rcnn-ilsvrc13-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v1-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v2-7.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-v2-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[squeezenet/model/squeezenet1.0-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[vgg/model/vgg19-7.onnx]": "pass",
+    "tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[tiny-yolov2/model/tinyyolov2-8.onnx]": "pass",
+    "tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[yolov2-coco/model/yolov2-coco-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/super_resolution_models_test.py::test_models[sub_pixel_cnn_2016/model/super-resolution-10.onnx]": "pass"
+  }
+}
diff --git a/tests/external/iree-test-suites/onnx_models/onnx_models_gpu_rocm_rdna3.json b/tests/external/iree-test-suites/onnx_models/onnx_models_gpu_rocm_rdna3.json
@@ -0,0 +1,30 @@
+{
+  "config_name": "gpu_rocm_rdna3",
+  "iree_compile_flags": [
+    "--iree-hal-target-backends=rocm",
+    "--iree-hip-target=gfx1100"
+  ],
+  "iree_run_module_flags": [
+    "--device=hip"
+  ],
+  "tests_and_expected_outcomes": {
+    "default": "skip",
+    "tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/age_googlenet.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[alexnet/model/bvlcalexnet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[caffenet/model/caffenet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[densenet-121/model/densenet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[efficientnet-lite4/model/efficientnet-lite4-11.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[inception_and_googlenet/googlenet/model/googlenet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[inception_and_googlenet/inception_v2/model/inception-v2-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[mnist/model/mnist-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[mobilenet/model/mobilenetv2-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[rcnn_ilsvrc13/model/rcnn-ilsvrc13-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v1-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v2-7.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[shufflenet/model/shufflenet-v2-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[tiny-yolov2/model/tinyyolov2-8.onnx]": "pass",
+    "tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[yolov2-coco/model/yolov2-coco-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/super_resolution_models_test.py::test_models[sub_pixel_cnn_2016/model/super-resolution-10.onnx]": "pass"
+  }
+}
diff --git a/tests/external/iree-test-suites/onnx_models/onnx_models_gpu_vulkan.json b/tests/external/iree-test-suites/onnx_models/onnx_models_gpu_vulkan.json
@@ -0,0 +1,21 @@
+{
+  "config_name": "gpu_vulkan",
+  "iree_compile_flags": [
+    "--iree-hal-target-backends=vulkan-spirv"
+  ],
+  "iree_run_module_flags": [
+    "--device=vulkan"
+  ],
+  "tests_and_expected_outcomes": {
+    "default": "skip",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[alexnet/model/bvlcalexnet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[caffenet/model/caffenet-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[mnist/model/mnist-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[mobilenet/model/mobilenetv2-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[rcnn_ilsvrc13/model/rcnn-ilsvrc13-9.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v1-12.onnx]": "pass",
+    "tests/model_zoo/validated/vision/classification_models_test.py::test_models[resnet/model/resnet50-v2-7.onnx]": "pass",
+    "tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[tiny-yolov2/model/tinyyolov2-8.onnx]": "pass",
+    "tests/model_zoo/validated/vision/object_detection_segmentation_models_test.py::test_models[yolov2-coco/model/yolov2-coco-9.onnx]": "pass"
+  }
+}
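
The three config files above share one shape: a `tests_and_expected_outcomes` map with a `"default"` entry plus explicit per-test overrides. The sketch below shows one way such a map could be read, and how to list tests enabled on one backend but not another. It assumes that a test ID missing from the map falls back to the `"default"` value; the real lookup lives in the iree-test-suites pytest runner and may differ. `expected_outcome` and `passing_tests` are hypothetical helper names, and the embedded configs are abbreviated excerpts of the files in this diff.

```python
import json

PREFIX = "tests/model_zoo/validated/vision/classification_models_test.py::test_models"
MNIST = f"{PREFIX}[mnist/model/mnist-12.onnx]"
VGG19 = f"{PREFIX}[vgg/model/vgg19-7.onnx]"

# Abbreviated excerpts of onnx_models_cpu_llvm_task.json and
# onnx_models_gpu_vulkan.json (flags and most tests omitted).
CPU_CONFIG = json.loads(json.dumps({
    "config_name": "cpu_llvm_task",
    "tests_and_expected_outcomes": {"default": "skip", MNIST: "pass", VGG19: "pass"},
}))
VULKAN_CONFIG = json.loads(json.dumps({
    "config_name": "gpu_vulkan",
    "tests_and_expected_outcomes": {"default": "skip", MNIST: "pass"},
}))


def expected_outcome(config: dict, test_id: str) -> str:
    """Look up a test's expected outcome, falling back to "default" (assumed)."""
    outcomes = config["tests_and_expected_outcomes"]
    return outcomes.get(test_id, outcomes["default"])


def passing_tests(config: dict) -> set[str]:
    """Return the test IDs explicitly marked "pass" in a config."""
    outcomes = config["tests_and_expected_outcomes"]
    return {t for t, outcome in outcomes.items() if t != "default" and outcome == "pass"}


# Tests enabled on CPU that the Vulkan config leaves at the default.
cpu_only = sorted(passing_tests(CPU_CONFIG) - passing_tests(VULKAN_CONFIG))
print(cpu_only)
print(expected_outcome(VULKAN_CONFIG, VGG19))
```

In practice the configs would be loaded from the files under `tests/external/iree-test-suites/onnx_models/` with `json.load` rather than embedded inline.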