[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHash #121729

MaskRay · 2025-01-06T04:18:01Z

xxHash, inferior to xxh3, is discouraged. We try not to use xxhash in
lld.

Switch to read32le for content hash and xxh3/stable_hash_combine for
relocation hash. Remove the intermediate std::string for relocation
hash.

Change the tail hashing scheme to consider individual bytes instead.
This helps group 0102 and 0201 together. The benefit is negligible,
though.

Created using spr 1.3.5-bogner

llvmbot · 2025-01-06T04:18:40Z

@llvm/pr-subscribers-lld-macho

@llvm/pr-subscribers-lld

Author: Fangrui Song (MaskRay)

Changes

xxHash, inferior to xxh3, is discouraged. We try not to use xxhash in
lld.

Switch to read32le for content hash and xxh3/stable_hash_combine for
relocation hash. Remove the intermediate std::string for relocation
hash.

Full diff: https://github.com/llvm/llvm-project/pull/121729.diff

2 Files Affected:

(modified) lld/MachO/BPSectionOrderer.h (+16-17)
(modified) lld/include/lld/Common/BPSectionOrdererBase.h (-9)

diff --git a/lld/MachO/BPSectionOrderer.h b/lld/MachO/BPSectionOrderer.h
index 8ba911fcc546bd..3de815d79b0f4c 100644
--- a/lld/MachO/BPSectionOrderer.h
+++ b/lld/MachO/BPSectionOrderer.h
@@ -19,7 +19,10 @@
 #include "Symbols.h"
 #include "lld/Common/BPSectionOrdererBase.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StableHashing.h"
 #include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Endian.h"
+#include "llvm/Support/xxhash.h"
 
 namespace lld::macho {
 
@@ -91,22 +94,20 @@ class BPSectionMacho : public BPSectionBase {
     constexpr unsigned windowSize = 4;
 
     // Calculate content hashes
-    size_t dataSize = isec->data.size();
-    for (size_t i = 0; i < dataSize; i++) {
-      auto window = isec->data.drop_front(i).take_front(windowSize);
-      hashes.push_back(xxHash64(window));
-    }
+    ArrayRef<uint8_t> data = isec->data;
+    for (size_t i = 0; i <= data.size() - windowSize; i++)
+      hashes.push_back(llvm::support::endian::read32le(data.data() + i));
 
     // Calculate relocation hashes
     for (const auto &r : isec->relocs) {
-      if (r.length == 0 || r.referent.isNull() || r.offset >= isec->data.size())
+      if (r.length == 0 || r.referent.isNull() || r.offset >= data.size())
         continue;
 
       uint64_t relocHash = getRelocHash(r, sectionToIdx);
       uint32_t start = (r.offset < windowSize) ? 0 : r.offset - windowSize + 1;
       for (uint32_t i = start; i < r.offset + r.length; i++) {
-        auto window = isec->data.drop_front(i).take_front(windowSize);
-        hashes.push_back(xxHash64(window) + relocHash);
+        auto window = data.drop_front(i).take_front(windowSize);
+        hashes.push_back(xxh3_64bits(window) ^ relocHash);
       }
     }
 
@@ -124,19 +125,17 @@ class BPSectionMacho : public BPSectionBase {
     std::optional<uint64_t> sectionIdx;
     if (auto it = sectionToIdx.find(isec); it != sectionToIdx.end())
       sectionIdx = it->second;
-    std::string kind;
+    uint64_t kind = -1, value = 0;
     if (isec)
-      kind = ("Section " + Twine(isec->kind())).str();
+      kind = uint64_t(isec->kind());
 
     if (auto *sym = reloc.referent.dyn_cast<Symbol *>()) {
-      kind += (" Symbol " + Twine(sym->kind())).str();
-      if (auto *d = llvm::dyn_cast<Defined>(sym)) {
-        return BPSectionBase::getRelocHash(kind, sectionIdx.value_or(0),
-                                           d->value, reloc.addend);
-      }
+      kind = (kind << 8) | uint8_t(sym->kind());
+      if (auto *d = llvm::dyn_cast<Defined>(sym))
+        value = d->value;
     }
-    return BPSectionBase::getRelocHash(kind, sectionIdx.value_or(0), 0,
-                                       reloc.addend);
+    return llvm::stable_hash_combine(kind, sectionIdx.value_or(0), value,
+                                     reloc.addend);
   }
 };
 
diff --git a/lld/include/lld/Common/BPSectionOrdererBase.h b/lld/include/lld/Common/BPSectionOrdererBase.h
index e2cb41f69cc684..29599afa03bd40 100644
--- a/lld/include/lld/Common/BPSectionOrdererBase.h
+++ b/lld/include/lld/Common/BPSectionOrdererBase.h
@@ -18,7 +18,6 @@
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
-#include "llvm/Support/xxhash.h"
 #include <memory>
 #include <optional>
 
@@ -56,14 +55,6 @@ class BPSectionBase {
     return P1;
   }
 
-  static uint64_t getRelocHash(llvm::StringRef kind, uint64_t sectionIdx,
-                               uint64_t offset, uint64_t addend) {
-    return llvm::xxHash64((kind + ": " + llvm::Twine::utohexstr(sectionIdx) +
-                           " + " + llvm::Twine::utohexstr(offset) + " + " +
-                           llvm::Twine::utohexstr(addend))
-                              .str());
-  }
-
   /// Reorders sections using balanced partitioning algorithm based on profile
   /// data.
   static llvm::DenseMap<const BPSectionBase *, size_t>

ellishg · 2025-01-06T21:41:45Z

lld/MachO/BPSectionOrderer.h

-      hashes.push_back(xxHash64(window));
-    }
+    ArrayRef<uint8_t> data = isec->data;
+    for (size_t i = 0; i <= data.size() - windowSize; i++)


Sections shorter than 4 bytes are trivial functions that are likely
foled by ICF.

We should still guard against the case when data.size() < windowSize because these sections could be data sections in the future

ellishg · 2025-01-06T22:49:18Z

lld/MachO/BPSectionOrderer.h

-    }
+    ArrayRef<uint8_t> data = isec->data;
+    for (size_t i = 0; i <= data.size() - windowSize; i++)
+      hashes.push_back(llvm::support::endian::read32le(data.data() + i));


This changes how we take hashes at the end of the section, but it could be a change for the better. I'm testing this PR on our apps to see if there is a size regression or not

ellishg

Only very minor size changes on my end. I'm curious if @Colibrow sees and size regressions. LGTM after fixing the data.size() < windowSize issue.

Created using spr 1.3.5-bogner

MaskRay · 2025-01-07T04:13:04Z

Only very minor size changes on my end. I'm curious if @Colibrow sees and size regressions. LGTM after fixing the data.size() < windowSize issue.

Thanks. I've changed the tail hashing scheme to:

    if (data.size() >= windowSize)
      for (size_t i = 0; i <= data.size() - windowSize; ++i)
        hashes.push_back(llvm::support::endian::read32le(data.data() + i));
    for (uint8_t byte : data.take_back(windowSize - 1))
      hashes.push_back(byte);

This helps group 0102 and 0201 together. The similarity between 030201 and 0201 is now the same as 030201 and 0102, which should be fine.

ellishg · 2025-01-07T04:36:08Z

lld/MachO/BPSectionOrderer.h

+    ArrayRef<uint8_t> data = isec->data;
+    if (data.size() >= windowSize)
+      for (size_t i = 0; i <= data.size() - windowSize; ++i)
+        hashes.push_back(llvm::support::endian::read32le(data.data() + i));


Why are we using read32le() here and xxh3_64bits() for relocations below? As I understand, read32le() only works here because the window size is exactly 4. I chose this window size because it gave the best results on a few binaries, but other window sizes could work better for other scenarios. If we use xxh3_64bits() in both cases, we are free to change windowSize.

I agree that this is weird. xxh3_64bits(window) for relocation hashing is just because it's easy: no need to handle the shorter-than-4-bytes case.

Hmmm. Reloc::length is actually a logarithm field. For Mach-O arm64, the relocation offsets are aligned to start of the instruction. Shall we compute one single hash for a relocation? I guess the sliding window doesn't help, but happy to be proven wrong.

file size gzipped size

libsample.so 3,181,560 1,487,043

libsample.so_xxh3 3,181,544 1,486,208

It seems good although the change is minor. The object file size change confused me. Do you have any ideas?

[𝘀𝗽𝗿] initial version

671e163

Created using spr 1.3.5-bogner

MaskRay requested a review from ellishg January 6, 2025 04:18

llvmbot added lld lld:MachO labels Jan 6, 2025

MaskRay requested a review from kyulee-com January 6, 2025 04:18

ellishg reviewed Jan 6, 2025

View reviewed changes

ellishg reviewed Jan 7, 2025

View reviewed changes

MaskRay added 2 commits January 6, 2025 20:09

change tail hashing scheme

3d4e558

Created using spr 1.3.5-bogner

.

86bd9d9

Created using spr 1.3.5-bogner

ellishg reviewed Jan 7, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHash #121729

[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHash #121729

MaskRay commented Jan 6, 2025 •

edited

Loading

llvmbot commented Jan 6, 2025 •

edited

Loading

ellishg Jan 6, 2025

ellishg Jan 6, 2025

ellishg left a comment

MaskRay commented Jan 7, 2025

ellishg Jan 7, 2025

MaskRay Jan 7, 2025 •

edited

Loading

Colibrow Jan 7, 2025 •

edited

Loading

file	size	gzipped size
libsample.so	3,181,560	1,487,043
libsample.so_xxh3	3,181,544	1,486,208

[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHash #121729

Are you sure you want to change the base?

[lld-macho,BalancedPartition] Simplify relocation hash and avoid xxHash #121729

Conversation

MaskRay commented Jan 6, 2025 • edited Loading

llvmbot commented Jan 6, 2025 • edited Loading

ellishg Jan 6, 2025

Choose a reason for hiding this comment

ellishg Jan 6, 2025

Choose a reason for hiding this comment

ellishg left a comment

Choose a reason for hiding this comment

MaskRay commented Jan 7, 2025

ellishg Jan 7, 2025

Choose a reason for hiding this comment

MaskRay Jan 7, 2025 • edited Loading

Choose a reason for hiding this comment

Colibrow Jan 7, 2025 • edited Loading

Choose a reason for hiding this comment

MaskRay commented Jan 6, 2025 •

edited

Loading

llvmbot commented Jan 6, 2025 •

edited

Loading

MaskRay Jan 7, 2025 •

edited

Loading

Colibrow Jan 7, 2025 •

edited

Loading