Develop Mac versions of solutions for a few labs #60

Open
dendibakh opened this issue Sep 30, 2022 · 9 comments

Comments

@dendibakh
Owner

Currently the following labs don't have solutions for the Mac M1 platform:

  ["memory_bound"]["huge_pages_1"]          - need to check huge pages on Mac
  ["misc"]["io_opt1"]                       - mmap on Mac
  ["core_bound"]["compiler_intrinsics_1"]   - NEON version
  ["core_bound"]["compiler_intrinsics_2"]   - NEON version

This prevents automated benchmarking of their speedups in CI.

@andrewevstyukhin
Collaborator

You can try to use "sse2neon"
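
A rough sketch of how that could look in a lab source file, assuming `sse2neon.h` has been placed somewhere on the include path: the same SSE2 intrinsics compile against `<emmintrin.h>` on x86 and against sse2neon on ARM. The macro `SKETCH_HAVE_SSE_API` and the function `sumUint32` are made up for this illustration and are not part of the labs. If neither header is available, the code falls back to a plain scalar loop.

```cpp
#include <cstddef>
#include <cstdint>

// Pick a provider for the SSE2 intrinsics. SKETCH_HAVE_SSE_API is a
// hypothetical macro used only in this sketch.
#if defined(__ARM_NEON) && defined(__has_include)
#  if __has_include("sse2neon.h")
#    include "sse2neon.h"  // maps _mm_* intrinsics onto NEON
#    define SKETCH_HAVE_SSE_API 1
#  endif
#elif defined(__SSE2__)
#  include <emmintrin.h>   // native SSE2 on x86
#  define SKETCH_HAVE_SSE_API 1
#endif

// Sums 32-bit unsigned integers with wraparound, four lanes at a time when
// an SSE2-style API is available.
std::uint32_t sumUint32(const std::uint32_t* data, std::size_t n) {
  std::size_t i = 0;
  std::uint32_t total = 0;
#if defined(SKETCH_HAVE_SSE_API)
  __m128i acc = _mm_setzero_si128();
  for (; i + 4 <= n; i += 4) {
    const __m128i chunk =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
    acc = _mm_add_epi32(acc, chunk);
  }
  std::uint32_t lanes[4];
  _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes), acc);
  total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
  // Scalar tail (or the whole array when no SIMD header was found).
  for (; i < n; ++i) total += data[i];
  return total;
}
```

The point of the structure is that only the include block differs per platform, while the body written with `_mm_*` intrinsics stays unchanged.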

@dendibakh
Owner Author

> You can try to use "sse2neon"

Thanks! Good idea.

@Cosmin-B
Collaborator

Hi @dendibakh, I can confirm that using sse2neon to solve ["core_bound"]["compiler_intrinsics_1"] does work, although it is a bit slower than writing pure ARM NEON code, because of the differences between the architectures and the extra instructions needed to translate SSE to NEON while keeping the same x86 algorithms.

You can check the CI job as well as my branch with the commits.

Once again, thank you for the excellent work!
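
To illustrate the kind of gap described above, here is a small sketch that is not code from this thread or from the lab. A common x86 idiom is to compare 16 bytes and then call `_mm_movemask_epi8`, which has no single-instruction NEON counterpart, so a translation layer has to emulate it with several instructions. A hand-written AArch64 version can instead reduce the comparison result with `vmaxvq_u8`. The function name `containsByte` is hypothetical, and no timings are claimed for either path.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__aarch64__) && defined(__ARM_NEON)
#  include <arm_neon.h>
#elif defined(__SSE2__)
#  include <emmintrin.h>
#endif

// Returns true if `needle` occurs anywhere in data[0..n).
bool containsByte(const std::uint8_t* data, std::size_t n, std::uint8_t needle) {
  std::size_t i = 0;
#if defined(__aarch64__) && defined(__ARM_NEON)
  // Native NEON: compare 16 bytes, then take the horizontal maximum.
  const uint8x16_t pattern = vdupq_n_u8(needle);
  for (; i + 16 <= n; i += 16) {
    const uint8x16_t eq = vceqq_u8(vld1q_u8(data + i), pattern);
    if (vmaxvq_u8(eq) != 0) return true;
  }
#elif defined(__SSE2__)
  // x86 idiom: compare 16 bytes, then extract one bit per byte.
  const __m128i pattern = _mm_set1_epi8(static_cast<char>(needle));
  for (; i + 16 <= n; i += 16) {
    const __m128i chunk =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
    if (_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, pattern)) != 0) return true;
  }
#endif
  // Scalar tail, also used as the full fallback on other targets.
  for (; i < n; ++i) {
    if (data[i] == needle) return true;
  }
  return false;
}
```

Both paths implement the same search; the difference is only in how the 16 comparison results are collapsed into a single yes/no answer.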

@Cosmin-B
Collaborator

Cosmin-B commented Dec 11, 2022

Oops, I did not mean to close this. Denis, can you please re-open this?

@dendibakh dendibakh reopened this Dec 13, 2022
@dendibakh
Owner Author

> Hi @dendibakh, I can confirm that using sse2neon to solve the ["core_bound"]["compiler_intrinsics_1"] does work,

This is nice to know! I haven't used sse2neon before.

> albeit it is a bit slower than writing pure ARM Neon code, due to the differences in the architectures and the more instructions need to translate from SSE to NEON using the same x86 algorithms.

I can't find your NEON implementation. Can you please share it?

> Once again, thank you for the excellent work!

You're welcome! :)

@Cosmin-B Cosmin-B reopened this Dec 13, 2022
@Cosmin-B
Collaborator

Cosmin-B commented Dec 13, 2022

Hey @dendibakh

Here is the implementation. You can look at the history and see that I added sse2neon.h to the compiler_intrinsics_1 folder. You could add it as a dependency that is pulled automatically on NEON-capable ARM devices.

Additionally, here is the link to the CI job for M1 Mac. I had to enable the CI to run on M1 for this lab; I did that here.

P.S.: I am sorry for temporarily closing the issue again. The GitHub interface is not friendly enough for me.

Kind regards,
Cosmin

@dendibakh
Owner Author

I thought you said you wrote NEON intrinsics yourself without using the sse2neon library, no?

@Cosmin-B
Collaborator

> I can confirm that using sse2neon to solve the ["core_bound"]["compiler_intrinsics_1"] does work, albeit it is a bit slower than writing pure ARM Neon code

Hey, @dendibakh, I am sorry for the misunderstanding. However, I did not say that. I only said that I confirmed that it could be done using the sse2neon library.

Kind regards,
Cosmin

@dendibakh
Owner Author

> > I can confirm that using sse2neon to solve the ["core_bound"]["compiler_intrinsics_1"] does work, albeit it is a bit slower than writing pure ARM Neon code
>
> Hey, @dendibakh, I am sorry for the misunderstanding. However, I did not say that. I only said that I confirmed that it could be done using the sse2neon library.
>
> Kind regards, Cosmin

Ok, got it, no worries. Thanks for sharing your experiments.


3 participants