[GPU] Enable GEMMs to first attempt LLVMGPUTileAndFuse with intrinsic by default #19520
base: main
Conversation
Force-pushed from 38f5a22 to 7d687d7
There are compiler failures in the regression-suite models; converting to draft while I debug.
Force-pushed from 7d687d7 to 7e2cdf8
The problem was missing functionality for GEMMs of type (f16, f16) -> f16. I filed this issue for it.
Force-pushed from e6aa895 to 3bc822c
Found another issue with accumulating GEMMs: #19546
Force-pushed from 2adc85d to 2111358
Also need to disable prefetching when using C promotion, due to this issue: #19612
Force-pushed from 210ef2a to 017e558
Signed-off-by: Nirvedh <[email protected]> Signed-off-by: Nirvedh Meshram <[email protected]>
Signed-off-by: Nirvedh Meshram <[email protected]>
Force-pushed from 017e558 to 982856b
Based on comparisons with iree-kernel-benchmark here, the performance of VectorDistribute vs. TileAndFuse when using intrinsics seems comparable. Note that none of the tests in the sheet used the padding extension available in TileAndFuse after #19484, so it is a fair comparison of the pipelines themselves. In some cases TileAndFuse had a speedup that seems beyond the noise level, and overall it averages out to 1.25x faster.
However, we will be looking at LLAMA and SDXL numbers before actually considering this PR for merging.
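As an aside on how an aggregate figure like the 1.25x above is typically derived from per-kernel benchmark times, here is a minimal sketch. The kernel names and timings below are made up for illustration; they are not from the actual iree-kernel-benchmark sheet. A geometric mean of per-kernel speedups is one common choice for averaging ratios.

```python
# Sketch: aggregate per-kernel speedups (baseline time / candidate time)
# into a single geometric-mean figure. All data here is hypothetical.
from math import prod

# (kernel, vector_distribute_us, tile_and_fuse_us) -- illustrative values only
results = [
    ("gemm_2048", 120.0, 95.0),
    ("gemm_4096", 480.0, 390.0),
    ("gemm_512", 30.0, 31.0),
]

# Speedup > 1 means TileAndFuse is faster than VectorDistribute.
speedups = [vd / taf for _, vd, taf in results]
geo_mean = prod(speedups) ** (1 / len(speedups))
print(f"geometric-mean speedup: {geo_mean:.2f}x")
```

The geometric mean is preferred over the arithmetic mean for ratios because it is symmetric: a 2x speedup and a 2x slowdown cancel out exactly.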
Fixes: #18858