The function seems to compute the multiplication in the wrong order; I am only inferring that from my tests. The fix should be:
"__kernel void gpu_matrix_mult(const int M, const int N, const int K, const __global float* A, const __global float* B, __global float* C){"
"    const int globalRow = get_global_id(0);"
"    const int globalCol = get_global_id(1);"
"    float acc = 0;"
"    for (int i=0; i<N; ++i){"
"        acc += A[globalRow*N + i] * B[i*K + globalCol];"
"    }"
"    C[globalRow*K + globalCol] = acc;"
"}";
Computing the multiplication on the CPU, or just running it with SIZE 4 or 8, printing the A, B and C matrices and doing some hand calculation, should show a different result. The function I've used:
void cpu_matrix_mult(const float* const c_a, const float* const c_b, float* c_c, const int m, const int n, const int k){
    float sum = 0;
    for (int i = 0; i < m; ++i){
        for (int j = 0; j < k; ++j){
            sum = 0;
            for (int h = 0; h < n; ++h){
                sum += c_a[i * n + h] * c_b[h * k + j];
            }
            c_c[i * k + j] = sum;
        }
    }
}
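To make the hand calculation concrete, here is a minimal sketch that emulates the proposed kernel's indexing on the CPU and counts how many elements differ from values worked out by hand. The names kernel_element and count_mismatches are hypothetical helpers, not part of the repository, and the row-major layout (A is m x n, B is n x k, C is m x k) is the assumption made by the proposed fix above.

```c
#include <assert.h>

/* Hypothetical helper: computes what one work-item (row, col) of the
   proposed kernel would write, assuming row-major storage. */
static float kernel_element(const float *a, const float *b,
                            int n, int k, int row, int col) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        acc += a[row * n + i] * b[i * k + col];
    }
    return acc;
}

/* Hypothetical helper: returns the number of elements of the m x k
   product that differ from `expected` by more than `tol`. */
int count_mismatches(const float *a, const float *b, const float *expected,
                     int m, int n, int k, float tol) {
    assert(m > 0 && n > 0 && k > 0);
    int bad = 0;
    for (int row = 0; row < m; ++row) {
        for (int col = 0; col < k; ++col) {
            float diff = kernel_element(a, b, n, k, row, col)
                         - expected[row * k + col];
            if (diff < 0.0f) diff = -diff;   /* absolute value without math.h */
            if (diff > tol) ++bad;
        }
    }
    return bad;
}
```

For example, with A = {1,2,3,4,5,6} read as a 2 x 3 row-major matrix and B = {7,8,9,10,11,12} read as 3 x 2, the product by hand is {58, 64, 139, 154}, so count_mismatches(a, b, expected, 2, 3, 2, 1e-6f) should return 0.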
We assume data to be stored in column-major format (Fortran-style), following cuBLAS's default. If we wanted, we could easily change this to row-major by swapping the A and B matrices and the N and M constants, so this is not a real limitation of our code.
https://github.com/CNugteren/myGEMM/blame/e2a364537f2b8725b3f5ba5f81008d04558a2327/extra/minimal.cpp#L39
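The swap described in that quote can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the repository's code: gemm_colmajor and gemm_rowmajor are hypothetical names, and the dimensions follow the convention C (M x N) = A (M x K) * B (K x N), which differs from the parameter naming used in the issue's proposed kernel. The idea is that a row-major buffer read as column-major is the transpose, and C^T = B^T * A^T, so passing B in place of A and exchanging M and N yields the row-major product.

```c
#include <assert.h>

/* Column-major GEMM sketch: C (M x N) = A (M x K) * B (K x N).
   Element (row, col) of an R-row matrix lives at index col * R + row. */
void gemm_colmajor(int M, int N, int K,
                   const float *A, const float *B, float *C) {
    assert(M > 0 && N > 0 && K > 0);
    for (int col = 0; col < N; ++col) {
        for (int row = 0; row < M; ++row) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k) {
                acc += A[k * M + row] * B[col * K + k];
            }
            C[col * M + row] = acc;
        }
    }
}

/* Row-major product obtained from the column-major routine by swapping
   A with B and M with N, as the quoted passage suggests. */
void gemm_rowmajor(int M, int N, int K,
                   const float *A, const float *B, float *C) {
    gemm_colmajor(N, M, K, B, A, C);
}
```

With the same buffers A = {1,2,3,4,5,6} and B = {7,8,9,10,11,12} and M = 2, N = 2, K = 3, gemm_rowmajor writes {58, 64, 139, 154}, while gemm_colmajor writes {76, 100, 103, 136}. Both are correct for their own layout, which would explain why a row-major hand calculation disagrees with the output of a column-major kernel.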