-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHash #121729
base: main
Are you sure you want to change the base?
[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHash #121729
Conversation
Created using spr 1.3.5-bogner
@llvm/pr-subscribers-lld-macho @llvm/pr-subscribers-lld Author: Fangrui Song (MaskRay) ChangesxxHash, inferior to xxh3, is discouraged. We try not to use xxhash in Switch to read32le for content hash and xxh3/stable_hash_combine for Full diff: https://github.com/llvm/llvm-project/pull/121729.diff 2 Files Affected:
diff --git a/lld/MachO/BPSectionOrderer.h b/lld/MachO/BPSectionOrderer.h
index 8ba911fcc546bd..3de815d79b0f4c 100644
--- a/lld/MachO/BPSectionOrderer.h
+++ b/lld/MachO/BPSectionOrderer.h
@@ -19,7 +19,10 @@
#include "Symbols.h"
#include "lld/Common/BPSectionOrdererBase.h"
#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StableHashing.h"
#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Endian.h"
+#include "llvm/Support/xxhash.h"
namespace lld::macho {
@@ -91,22 +94,20 @@ class BPSectionMacho : public BPSectionBase {
constexpr unsigned windowSize = 4;
// Calculate content hashes
- size_t dataSize = isec->data.size();
- for (size_t i = 0; i < dataSize; i++) {
- auto window = isec->data.drop_front(i).take_front(windowSize);
- hashes.push_back(xxHash64(window));
- }
+ ArrayRef<uint8_t> data = isec->data;
+ for (size_t i = 0; i <= data.size() - windowSize; i++)
+ hashes.push_back(llvm::support::endian::read32le(data.data() + i));
// Calculate relocation hashes
for (const auto &r : isec->relocs) {
- if (r.length == 0 || r.referent.isNull() || r.offset >= isec->data.size())
+ if (r.length == 0 || r.referent.isNull() || r.offset >= data.size())
continue;
uint64_t relocHash = getRelocHash(r, sectionToIdx);
uint32_t start = (r.offset < windowSize) ? 0 : r.offset - windowSize + 1;
for (uint32_t i = start; i < r.offset + r.length; i++) {
- auto window = isec->data.drop_front(i).take_front(windowSize);
- hashes.push_back(xxHash64(window) + relocHash);
+ auto window = data.drop_front(i).take_front(windowSize);
+ hashes.push_back(xxh3_64bits(window) ^ relocHash);
}
}
@@ -124,19 +125,17 @@ class BPSectionMacho : public BPSectionBase {
std::optional<uint64_t> sectionIdx;
if (auto it = sectionToIdx.find(isec); it != sectionToIdx.end())
sectionIdx = it->second;
- std::string kind;
+ uint64_t kind = -1, value = 0;
if (isec)
- kind = ("Section " + Twine(isec->kind())).str();
+ kind = uint64_t(isec->kind());
if (auto *sym = reloc.referent.dyn_cast<Symbol *>()) {
- kind += (" Symbol " + Twine(sym->kind())).str();
- if (auto *d = llvm::dyn_cast<Defined>(sym)) {
- return BPSectionBase::getRelocHash(kind, sectionIdx.value_or(0),
- d->value, reloc.addend);
- }
+ kind = (kind << 8) | uint8_t(sym->kind());
+ if (auto *d = llvm::dyn_cast<Defined>(sym))
+ value = d->value;
}
- return BPSectionBase::getRelocHash(kind, sectionIdx.value_or(0), 0,
- reloc.addend);
+ return llvm::stable_hash_combine(kind, sectionIdx.value_or(0), value,
+ reloc.addend);
}
};
diff --git a/lld/include/lld/Common/BPSectionOrdererBase.h b/lld/include/lld/Common/BPSectionOrdererBase.h
index e2cb41f69cc684..29599afa03bd40 100644
--- a/lld/include/lld/Common/BPSectionOrdererBase.h
+++ b/lld/include/lld/Common/BPSectionOrdererBase.h
@@ -18,7 +18,6 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Twine.h"
-#include "llvm/Support/xxhash.h"
#include <memory>
#include <optional>
@@ -56,14 +55,6 @@ class BPSectionBase {
return P1;
}
- static uint64_t getRelocHash(llvm::StringRef kind, uint64_t sectionIdx,
- uint64_t offset, uint64_t addend) {
- return llvm::xxHash64((kind + ": " + llvm::Twine::utohexstr(sectionIdx) +
- " + " + llvm::Twine::utohexstr(offset) + " + " +
- llvm::Twine::utohexstr(addend))
- .str());
- }
-
/// Reorders sections using balanced partitioning algorithm based on profile
/// data.
static llvm::DenseMap<const BPSectionBase *, size_t>
|
lld/MachO/BPSectionOrderer.h
Outdated
hashes.push_back(xxHash64(window)); | ||
} | ||
ArrayRef<uint8_t> data = isec->data; | ||
for (size_t i = 0; i <= data.size() - windowSize; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sections shorter than 4 bytes are trivial functions that are likely
foled by ICF.
We should still guard against the case when data.size() < windowSize
because these sections could be data sections in the future
lld/MachO/BPSectionOrderer.h
Outdated
} | ||
ArrayRef<uint8_t> data = isec->data; | ||
for (size_t i = 0; i <= data.size() - windowSize; i++) | ||
hashes.push_back(llvm::support::endian::read32le(data.data() + i)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes how we take hashes at the end of the section, but it could be a change for the better. I'm testing this PR on our apps to see if there is a size regression or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only very minor size changes on my end. I'm curious if @Colibrow sees and size regressions. LGTM after fixing the data.size() < windowSize
issue.
Created using spr 1.3.5-bogner
Thanks. I've changed the tail hashing scheme to:
This helps group 0102 and 0201 together. The similarity between 030201 and 0201 is now the same as 030201 and 0102, which should be fine. |
ArrayRef<uint8_t> data = isec->data; | ||
if (data.size() >= windowSize) | ||
for (size_t i = 0; i <= data.size() - windowSize; ++i) | ||
hashes.push_back(llvm::support::endian::read32le(data.data() + i)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using read32le()
here and xxh3_64bits()
for relocations below? As I understand, read32le()
only works here because the window size is exactly 4. I chose this window size because it gave the best results on a few binaries, but other window sizes could work better for other scenarios. If we use xxh3_64bits()
in both cases, we are free to change windowSize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is weird. xxh3_64bits(window)
for relocation hashing is just because it's easy: no need to handle the shorter-than-4-bytes case.
Hmmm. Reloc::length
is actually a logarithm field. For Mach-O arm64, the relocation offsets are aligned to start of the instruction. Shall we compute one single hash for a relocation? I guess the sliding window doesn't help, but happy to be proven wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file | size | gzipped size |
---|---|---|
libsample.so | 3,181,560 | 1,487,043 |
libsample.so_xxh3 | 3,181,544 | 1,486,208 |
It seems good although the change is minor. The object file size change confused me. Do you have any ideas?
xxHash, inferior to xxh3, is discouraged. We try not to use xxhash in
lld.
Switch to read32le for content hash and xxh3/stable_hash_combine for
relocation hash. Remove the intermediate std::string for relocation
hash.
Change the tail hashing scheme to consider individual bytes instead.
This helps group 0102 and 0201 together. The benefit is negligible,
though.