The current version of the code seems to load the whole file into memory, which fails for large files. For example, in my case I get the following error:

memory allocation of 5404574964 bytes failed

It would be great if the code read the file sequentially, piece by piece.
A temporary solution to this, and maybe also a nice new feature, would be an option to parse only the first X bytes of a file and check those against the regex.
Traditional grep doesn't have this exact issue because its regexes are limited to lines, so it reads the file one line at a time. With bgrep, we shouldn't have such a restriction, because the binary pattern might contain the equivalent of a line-break character even when it doesn't represent an actual textual line break. If I recall correctly, the Regex crate has no support for arbitrary buffered reading/matching, which is why bgrep reads the entire file into memory. I believe the only feasible alternative for very large files is using memory maps (I know ripgrep can do this), and I'm willing to support implementing such a feature, even though it would be non-trivial and would probably require some unsafe code.
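For reference, a minimal sketch of what the memory-map approach might look like, assuming the memmap2 crate and the bytes API of the regex crate (the file name and pattern are placeholders, and this is not how bgrep currently works):

```rust
use std::fs::File;

use memmap2::Mmap;
use regex::bytes::Regex;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("large-file.bin")?;

    // Safety: the mapping is only sound as long as no other process truncates
    // or modifies the file while it is mapped; this is where unsafe comes in.
    let mmap = unsafe { Mmap::map(&file)? };

    // Match raw bytes; the pattern here (the ELF magic number) is just an example.
    let re = Regex::new(r"(?-u)\x7fELF")?;

    // The OS pages the file in lazily, so this can handle files that do not
    // fit in memory.
    for m in re.find_iter(&mmap[..]) {
        println!("match at offset {:#x}", m.start());
    }

    Ok(())
}
```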
For handling only the first X bytes of a file, one can combine the head command and bgrep with a pipe:
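```sh
# Check only the first 1 MiB of the file (illustrative: the byte count and
# pattern are placeholders, and this assumes bgrep reads from stdin when no
# file argument is given).
head -c 1048576 large-file.bin | bgrep '\x7fELF'
```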