Determine the total buffer size before starting to read LH5 data from a list of files #93

Open
gipert opened this issue May 11, 2024 · 1 comment · May be fixed by #109
Labels
lh5 HDF5 I/O performance Code performance

Comments

@gipert
Member

gipert commented May 11, 2024

Should we avoid resizing buffers every time a new file is read and appended, and instead allocate a buffer of the right total size up front? Would this approach use less memory and be faster?
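To make the comparison concrete, here is a minimal sketch of the two strategies using plain NumPy arrays as stand-ins for the per-file data. The function names `read_by_appending` and `read_preallocated` are hypothetical and are not part of the LGDO or LH5 API; the sketch also assumes the per-file lengths are already known, which is the open question for real files.

```python
import numpy as np


def read_by_appending(chunks):
    """Grow the output on every chunk: each step allocates a new array and copies."""
    out = np.empty(0, dtype=chunks[0].dtype)
    for chunk in chunks:
        out = np.concatenate([out, chunk])
    return out


def read_preallocated(chunks):
    """Allocate the output once at its final size, then fill it slice by slice."""
    # Assumption: lengths are known up front (here taken from the in-memory chunks).
    total = sum(len(chunk) for chunk in chunks)
    out = np.empty(total, dtype=chunks[0].dtype)
    start = 0
    for chunk in chunks:
        stop = start + len(chunk)
        out[start:stop] = chunk
        start = stop
    return out


# Stand-ins for the data read from three files.
chunks = [np.arange(3), np.arange(3, 7), np.arange(7, 12)]
appended = read_by_appending(chunks)
preallocated = read_preallocated(chunks)
```

Both functions return the same data. The appending version copies everything read so far each time a chunk is added, whereas the preallocated version copies each element once.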

@gipert gipert added performance Code performance lh5 HDF5 I/O labels May 11, 2024
@iguinn
Contributor

iguinn commented Jul 20, 2024

I think this would help with both memory and speed, but I'm not sure how we get the total size first without slowing things down (at least with core.read and store.read). I think this is a great idea for the data loader, as we could store sizes in the file DB.

Another option would be to allow the memory buffers for LGDO objects to have a capacity larger than the object's current length. Then, instead of growing the buffer every time we add a new file, we could, say, double its capacity whenever it fills up. In other words, our arrays would behave like C++ vectors.
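A rough sketch of that idea, assuming a 1-D NumPy-backed buffer, might look like the following. `GrowableBuffer` is a hypothetical illustration, not an existing LGDO class, and the doubling factor is just the example mentioned above.

```python
import numpy as np


class GrowableBuffer:
    """Hypothetical 1-D buffer whose capacity may exceed its logical length."""

    def __init__(self, dtype=np.float64, capacity=16):
        self._data = np.empty(max(1, capacity), dtype=dtype)
        self._size = 0  # number of valid elements

    def __len__(self):
        return self._size

    @property
    def capacity(self):
        return len(self._data)

    def append(self, values):
        """Append values, doubling the capacity only when the buffer is full."""
        values = np.asarray(values, dtype=self._data.dtype)
        needed = self._size + len(values)
        if needed > self.capacity:
            new_capacity = self.capacity
            while new_capacity < needed:
                new_capacity *= 2
            new_data = np.empty(new_capacity, dtype=self._data.dtype)
            new_data[: self._size] = self._data[: self._size]
            self._data = new_data
        self._data[self._size : needed] = values
        self._size = needed

    def view(self):
        """Return a view of the valid elements only (no copy)."""
        return self._data[: self._size]


buf = GrowableBuffer(dtype=np.int64, capacity=4)
for chunk in (np.arange(3), np.arange(3, 7), np.arange(7, 12)):
    buf.append(chunk)
```

With geometric growth the number of reallocations scales with the logarithm of the final size rather than with the number of files, at the cost of holding some unused capacity.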
