[C++20] [Modules] ld.lld: error: undefined symbol: initializer for module #421
Comments
With "Windows version of Clang", I presume you mean Clang running in MSVC mode (not a windows build of e.g. llvm-mingw). If we split up the steps so we get intermediate object files we can inspect, like this:
We see this:
$ llvm-nm hello.o
00000000 a @feat.00
00000010 T _ZGIW5hello
00000000 T _ZW5hello1fv
$ llvm-readobj --coff-directives hello.o
File: hello.o
Format: COFF-x86-64
Arch: x86_64
AddressSize: 64bit
Directive(s): -export:_ZW5hello1fv
$ llvm-nm main.o
00000000 a @feat.00
00000020 t _GLOBAL__sub_I_main.cpp
U _ZGIW5hello
U _ZW5hello1fv
U __main
00000000 T main
While if we do the same with Clang in MSVC mode, we see the following:
$ llvm-nm hello.o
00000000 T ?f@@YAXXZ
00000000 a @feat.00
$ llvm-readobj --coff-directives hello.o
File: hello.o
Format: COFF-x86-64
Arch: x86_64
AddressSize: 64bit
Directive(s): /EXPORT:"?f@@YAXXZ"
$ llvm-nm main.o
U ?f@@YAXXZ
00000000 a @feat.00
00000000 T main
So in MSVC mode, the symbol reference to
Anyway, to properly solve this issue for mingw mode... When we have a dllexport directive in any embedded file, we only export those symbols, while normally we'd export all symbols. We can add |
This appears to be a deficiency in Clang when targeting the MSVC ABI. The big three compilers all choose the strong module ownership model, so module names are included in the final mangled symbol names, i.e.
Looks like it is specific to the GNU ABI; the MSVC ABI doesn't have such a thing.
$ /opt/msvc/bin/x64/cl /std:c++20 /TP /interface /c hello.cppm /Fo:hello.cl.obj -DX=1
$ llvm-nm hello.cl.obj
00000000 T ?f@@YAXXZ::<!hello>
010582ef a @comp.id
80010190 a @feat.00
00000002 a @vol.md
GCC also has the same thing, but it is not referenced in
$ x86_64-w64-mingw32-g++ -std=c++20 -fmodules-ts -c -x c++ hello.cppm -o hello.gcc.o -DX=1
$ llvm-nm hello.gcc.o
0000000000000007 T _ZGIW5hello
0000000000000000 T _ZW5hello1fv
$ x86_64-w64-mingw32-g++ -std=c++20 -fmodules-ts -c main.cpp -o main.gcc.o
$ llvm-nm main.gcc.o
U _ZW5hello1fv
U __main
00000000 T main
Or |
I opened an upstream issue, llvm/llvm-project#89781, regarding the symbol mangling. |
Thanks!
Hmm, won't this mean that there's an Itanium ABI mismatch between these two compilers? If I build one object file with Clang and one with GCC, we might either have an initializer for the module that isn't called, or we might have an undefined reference to the initializer. I get that the actual binary compiled module (PCM) is compiler specific, but aren't object files supposed to be interoperable? Especially as, in the case of the original bug report here, the issue is across a DLL interface, which definitely should be compilable with two different compilers? |
Yes, it is an ABI issue. But I have no idea whether Clang or GCC is correct. Or maybe the Itanium ABI is missing a specification regarding |
I guess we can start out by filing a bug report on either LLVM's or GCC's bugzilla about it, or both, and whoever responds to it can maybe give guidance towards which side might have the bug, or they can suggest escalating it to https://github.com/itanium-cxx-abi/cxx-abi. |
1. [GCC] initializer for module is hidden with -fvisibility=hidden https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105397
2. [Clang] initializer for module is not dllexported when targeting MinGW mstorsjo/llvm-mingw#421
3. [Clang] not strong ownership model when targeting MSVC ABI llvm/llvm-project#89781
4. [Clang] module variables are not dllimported llvm/llvm-project#87887
5. [DLL] module functions are not dllimported causing indirect function call https://stackoverflow.com/a/74444920 https://developercommunity.visualstudio.com/t/10202062 https://gitlab.kitware.com/cmake/cmake/-/issues/25539
Ok, there is a PR itanium-cxx-abi/cxx-abi#144 for C++20 Modules. The related part is https://html-preview.github.io/?url=https://github.com/urnathan/cxx-abi/blob/main/abi.html#mangling-module-initializer. So, GCC chose to omit the call to the module-initializer for this trivial example since there are no objects with dynamic-initializers. BTW, GCC currently hides module-initializers if
After understanding the role of |
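For reference, here is a rough sketch of how the initializer symbol seen in the `llvm-nm` output (`_ZGIW5hello`) is put together from the module name. It only covers plain dotted module names (no partitions, no substitutions), and the helper names `mangle_module_name` and `module_initializer_symbol` are made up for illustration; they are not part of any compiler.

```cpp
#include <cassert>
#include <string>

// Sketch only: encodes each dotted component of a module name as
// "W<length><component>", e.g. "hello" -> "W5hello".
// Partitions and substitutions are deliberately not handled.
std::string mangle_module_name(const std::string& name) {
    assert(!name.empty());
    std::string out;
    std::size_t start = 0;
    while (start <= name.size()) {
        std::size_t dot = name.find('.', start);
        if (dot == std::string::npos) dot = name.size();
        std::string part = name.substr(start, dot - start);
        out += "W" + std::to_string(part.size()) + part;
        start = dot + 1;
    }
    return out;
}

// The module initializer is "_Z" + "GI" + <module-name>.
std::string module_initializer_symbol(const std::string& name) {
    return "_ZGI" + mangle_module_name(name);
}
```

With this, `module_initializer_symbol("hello")` gives the same `_ZGIW5hello` that shows up as undefined in `main.o`.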
Ok, so this bit I guess:
Indeed... But how does this spec mean to handle the case when one object decides to omit the module-initializer definition, while the other one decides to not omit making a call to it?
Oh, interesting!
Hmm. In the case of dllexporting, I wouldn't want to add any such custom logic in the linker (because it would need matching implementation in GNU ld etc, and the whole situation is quite nontrivial anyway), so I would prefer to stick to generating the relevant attributes where necessary. For potential references of doing this, can we make a slightly less trivial example, where the module does have dynamic initialization, which should trigger MSVC to produce a similar module-initializer? Then we could check how MSVC handles dllexport in this part of object file interfaces? |
The spec only allows to omit the call but not to omit the definition.
Yes, here is the example https://github.com/huangqinjin/cxxmodules/tree/master/shared-lib. The initialization of object
For MSVC, there is nothing new for modules: it just uses the CRT initialization mechanism and inserts the initializer of object
> nmake -f Makefile.msvc clean hello.exe
> dumpbin /section:.CRT$XCU /RELOCATIONS hello.obj
SECTION HEADER #26
.CRT$XCU name
0 physical address
0 virtual address
8 size of raw data
1C17 file pointer to raw data (00001C17 to 00001C1E)
1C1F file pointer to relocation table
0 file pointer to line numbers
1 number of relocations
0 number of line numbers
40400040 flags
Initialized Data
8 byte align
Read Only
RELOCATIONS #26
Symbol Symbol
Offset Type Applied To Index Name
-------- ---------------- ----------------- -------- ------
00000000 ADDR64 00000000 00000000 3B ??__Egploc@@YAXXZ (void __cdecl `dynamic initializer for 'gploc''(void))
For MinGW, I launch the example with WinDBG and put a breakpoint at the module-initializer
First time hit in
Second time hit in
Third time hit in
The first call to the module-initializer always happens during DLL loading; otherwise, if we dynamically loaded the DLL, those static objects would be left uninitialized. That means calls to the module-initializer from outside the DLL are entirely unnecessary, since they will never be the first. The GCC visibility bug no longer matters if the call to the module-initializer can be eliminated. Looks like this should be done in the linker? At compile time the compiler doesn't know whether the module-initializer comes from a DLL or not. |
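To illustrate why the later calls are harmless, here is a minimal model of a guarded module-initializer. This is an assumption about the general shape, not the code either compiler actually emits; `hello_module_initializer`, `hello_initialized` and `init_runs` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-ins: init_runs counts how often the real
// initialization work is performed.
static int init_runs = 0;
static bool hello_initialized = false;

// Sketch of a guarded initializer for a module named `hello`.
void hello_module_initializer() {
    if (hello_initialized) return;  // every call after the first is a no-op
    hello_initialized = true;
    // Initializers of imported modules would be called here.
    ++init_runs;                    // stands in for dynamic initialization of module objects
}

// Models the three breakpoint hits: DLL load first, then two later callers.
int run_three_calls() {
    hello_module_initializer();
    hello_module_initializer();
    hello_module_initializer();
    assert(init_runs == 1);
    return init_runs;
}
```

Only the first call does any work, which is why a caller outside the DLL could skip its call without changing behaviour, as long as the DLL's own static initialization already ran.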
Oh, I had missed the fact that GCC did emit the definition, but just skipped the call. Then this area is indeed completely clear, sorry for the confusion.
Ok, so this is similar to the call in
What you say here does make sense. So what's the reason that the Itanium ABI added the extra module-initializer call? Or is the case that the module-initializer is called via regular static initializer an extra implementation bonus in Clang/GCC that Itanium ABI doesn't mandate?
Yes, maybe... But ideally I would like to see some prior art in ELF land, that such module-initializer calls can be eliminated, before I'd venture out to implement that in a linker. (This would need to be implemented in lld/COFF, lld/ELF, ld.bfd, ld.gold, etc.) Because, as you say, that GCC issue can be fixed by linkers eliminating those calls (or more practically, redirecting those calls to a no-op function that just returns). Other than that, I do agree with specifically one comment in the GCC bug report, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105397#c1:
For Windows, this would be the same as my idea above - if anything within the module interface is marked dllexport, then the initializer should also be marked dllexport. |
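As a rough model of the intended outcome (not lld's or Clang's actual logic), the rule can be written down like this. The `Symbol` struct and `exported_symbols` function are hypothetical, and the "export everything when nothing is marked" branch mirrors the mingw behaviour described earlier in the thread.

```cpp
#include <string>
#include <vector>

// Hypothetical description of a symbol defined by a module interface unit.
struct Symbol {
    std::string name;
    bool dllexport;              // explicitly marked dllexport
    bool is_module_initializer;  // e.g. _ZGIW5hello
};

// Sketch: if nothing is marked dllexport, everything is exported.
// Otherwise only the marked symbols are exported, plus the module
// initializer, which follows the rest of the interface.
std::vector<std::string> exported_symbols(const std::vector<Symbol>& syms) {
    bool any_explicit = false;
    for (const Symbol& s : syms) any_explicit = any_explicit || s.dllexport;

    std::vector<std::string> out;
    for (const Symbol& s : syms) {
        if (!any_explicit || s.dllexport || s.is_module_initializer)
            out.push_back(s.name);
    }
    return out;
}
```

For the `hello` module above, with `_ZW5hello1fv` marked dllexport, this model would export both `_ZW5hello1fv` and `_ZGIW5hello`, which is what `main.o` needs to link.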
I guess with module initializers, we can guarantee that imported objects have been initialized, so we can access them safely even in namespace scope. |
Steps to Reproduce
hello.cppm
main.cpp
commands
-DX=1 results in an ld.lld error; -DX=2 and -DX=3 have no error.
-DX=1 with the Windows version of Clang also has no error.