Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite align_down with bitwise operations #391

Merged
merged 1 commit into from
Apr 28, 2021

Conversation

jserv
Copy link
Contributor

@jserv jserv commented Apr 21, 2021

mi_align_down_ptr was implemented with multiplication and division,
which can be converted to equivalent and deterministic bit operations.

mi_align_down_ptr was implemented with multiplication and division,
which can be converted to equivalent and deterministic bit operations.
@daanx
Copy link
Collaborator

daanx commented Apr 22, 2021

Thank you for the PR @jserv ! However, I will not merge it, apologies :-( Modern C compilers will always optimize divides and multiplies, and it being static is enough for the compiler to inline if appropiate. Moreover, by using division/multiply it will work also for non-powers of 2 and give better error/warning messages in case of overflow etc.

For example,

uintptr_t testing(uintptr_t sz) {
  return _mi_align_down(sz, 4096);
}

becomes

testing(unsigned long):
        mov     rax, rdi
        and     rax, -4096
        ret

https://godbolt.org/z/zf4frzqqc

@daanx
Copy link
Collaborator

daanx commented Apr 22, 2021

Ah, I see I do do this bit trick in _mi_align_up -- it makes sense of course if the alignment is unknown in which case the compiler cannot optimize more. Still, that only matters for operations where performance is paramount (in this case in the posix_memalign for example). In the os.c module this is not so important -- any OS call will completely dominate run time; so I am still not convinced; let's keep it simple where we can.

@jserv
Copy link
Contributor Author

jserv commented Apr 22, 2021

Compiler optimizations might only make sense when alignment is a constant. Think of the following:

static uintptr_t _mi_align_down(uintptr_t sz, size_t alignment) { return (sz / alignment) * alignment; }
static int page_size;
static int get_page_size() { return page_size; }
uintptr_t testing(uintptr_t sz) {
  page_size = 4096;
  return _mi_align_down(sz, get_page_size());
}

The generated assembly of function _mi_align_down would be: (x86_64, gcc-10.3)

_mi_align_down:
        push    rbp
        mov     rbp, rsp
        mov     QWORD PTR [rbp-8], rdi
        mov     QWORD PTR [rbp-16], rsi
        mov     rax, QWORD PTR [rbp-8]
        mov     edx, 0
        div     QWORD PTR [rbp-16]
        imul    rax, QWORD PTR [rbp-16]
        pop     rbp
        ret

Both div and imul instructions exist.

@jserv
Copy link
Contributor Author

jserv commented Apr 22, 2021

That was the reason why I sent another pull request #392 to enable expected compiler optimizations.

@daanx daanx changed the base branch from master to dev April 28, 2021 19:52
@daanx daanx merged commit 6d16581 into microsoft:dev Apr 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants