Poolp org/blake3 hash #448

Merged: 6 commits, Feb 10, 2025

Conversation

@poolpOrg (Collaborator) commented Feb 8, 2025

Propose BLAKE3 as an alternative hashing function.
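
For readers unfamiliar with the idea, here is a minimal sketch of how a selectable hashing function could be wired up in Go. It is not taken from plakar's `hashing/hashing.go`: the `hashers` map and the `newHasher` helper are hypothetical names, and `SHA512` from the standard library stands in for the second algorithm, since a BLAKE3 constructor would come from a third-party package.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"fmt"
	"hash"
	"strings"
)

// hashers maps an algorithm name to a constructor (hypothetical registry).
// A BLAKE3 entry would be added here using a third-party library's
// constructor; SHA512 is only a standard-library stand-in.
var hashers = map[string]func() hash.Hash{
	"SHA256": sha256.New,
	"SHA512": sha512.New,
}

// newHasher returns a fresh hash.Hash for the named algorithm,
// or an error if the name is not registered. Names are case-insensitive.
func newHasher(name string) (hash.Hash, error) {
	ctor, ok := hashers[strings.ToUpper(name)]
	if !ok {
		return nil, fmt.Errorf("unsupported hashing algorithm: %q", name)
	}
	return ctor(), nil
}

func main() {
	h, err := newHasher("sha256")
	if err != nil {
		panic(err)
	}
	h.Write([]byte("hello"))
	fmt.Printf("%x\n", h.Sum(nil))
}
```

Because callers only see `hash.Hash`, adding another algorithm under this assumption means adding one map entry.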

@omar-polo (Contributor) left a comment

lgtm

cmd/plakar/subcommands/create/plakar-create.1 (outdated, resolved)
hashing/hashing.go (outdated, resolved)
@glycerine commented Feb 10, 2025

Instead of zeebo, consider https://github.com/glycerine/blake3, on which I just did a bunch of performance work. It is a light fork of the hardware-accelerated https://github.com/lukechampine/blake3, to which I added parallel hashing across multiple goroutines, which gives massive speedups.

The zeebo blake3 is very up front about not doing multi-threading.

Measurements show it running roughly 3x to 20x faster:

	// Ignoring disk, mac does best at 19 bits; 512KB
	// parallel segBits = 14  =>  117.470199ms  (3.21 x speedup)
	// parallel segBits = 15  =>  87.613051ms  (4.31 x speedup)
	// parallel segBits = 16  =>  73.345909ms  (5.15 x speedup)
	// parallel segBits = 17  =>  69.810899ms  (5.41 x speedup)
	// parallel segBits = 18  =>  66.965459ms  (5.64 x speedup)
	// parallel segBits = 19  =>  64.883706ms  (5.82 x speedup)
	// parallel segBits = 20  =>  68.4841ms  (5.51 x speedup)
	// parallel segBits = 21  =>  67.292666ms  (5.61 x speedup)
	// parallel segBits = 22  =>  74.68442ms  (5.05 x speedup)
	// parallel segBits = 23  =>  79.767926ms  (4.73 x speedup)
	// parallel segBits = 24  =>  74.956038ms  (5.04 x speedup)

	// Linux, 48 cores, also does best at 19 or 21 bits, so use 19.
	// parallel segBits = 14  =>  170.105834ms  (2.65 x speedup)
	// parallel segBits = 15  =>  88.193593ms  (5.10 x speedup)
	// parallel segBits = 16  =>  39.2206ms  (11.47 x speedup)
	// parallel segBits = 17  =>  25.221818ms  (17.84 x speedup)
	// parallel segBits = 18  =>  21.495614ms  (20.93 x speedup)
	// parallel segBits = 19  =>  21.194702ms  (21.23 x speedup)
	// parallel segBits = 20  =>  21.930639ms  (20.52 x speedup)
	// parallel segBits = 21  =>  21.125441ms  (21.30 x speedup)
	// parallel segBits = 22  =>  22.625609ms  (19.89 x speedup)
	// parallel segBits = 23  =>  24.649581ms  (18.25 x speedup)
	// parallel segBits = 24  =>  29.267046ms  (15.37 x speedup)
	// --- PASS: TestBigMerge (1.38s)

@poolpOrg poolpOrg merged commit e2f2153 into main Feb 10, 2025
4 checks passed
@poolpOrg poolpOrg deleted the poolpOrg/blake3-hash branch February 10, 2025 08:44
@poolpOrg (Collaborator, Author) commented

We're going to discuss which library to use. Thanks for sharing.
