Helper for iterating length-delimited from a Read #157

Open
vorner opened this issue Mar 2, 2019 · 8 comments

Comments

@vorner
Contributor

vorner commented Mar 2, 2019

Hello

I know prost doesn't build its base abstraction on top of Read, but on Buf. However, if I have a huge file of length-delimited messages, I'm in quite a tricky situation. The options I see:

  • Either slurp the whole file into memory and then repeatedly call Message::decode_length_delimited(&mut buffer) until the whole buffer is consumed. This is convenient, but has the downside of putting the whole file into memory at once.
  • Manually juggle refilling the buffer with at least 10 bytes (while somehow handling EOF), calling decode_length_delimiter, then refilling with as many bytes as the delimiter says and calling Message::decode. That seems possible, but it is a lot of work and leaves a lot of room for errors.
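A minimal sketch of the second option, using only the standard library. It assumes the file uses the usual protobuf base-128 varint as the length delimiter. `FrameReader` is a hypothetical name, not part of prost. It yields the raw bytes of each message, which would then be handed to something like `Message::decode`. A clean EOF at a frame boundary ends the iteration, while an EOF inside a length or payload is reported as an error.

```rust
use std::io::{self, ErrorKind, Read};

/// Iterator over the raw payloads of varint-length-delimited frames.
/// Hypothetical helper; reads byte by byte, so wrap files in a `BufReader`.
pub struct FrameReader<R: Read> {
    inner: R,
}

impl<R: Read> FrameReader<R> {
    pub fn new(inner: R) -> Self {
        FrameReader { inner }
    }

    /// Reads one base-128 varint length.
    /// Returns `Ok(None)` on a clean EOF before the first byte.
    fn read_len(&mut self) -> io::Result<Option<u64>> {
        let mut value: u64 = 0;
        // A 64-bit varint occupies at most 10 bytes.
        for i in 0..10u32 {
            let mut byte = [0u8; 1];
            match self.inner.read_exact(&mut byte) {
                Ok(()) => {}
                Err(e) if i == 0 && e.kind() == ErrorKind::UnexpectedEof => return Ok(None),
                Err(e) => return Err(e),
            }
            // Low 7 bits carry data, least significant group first.
            value |= u64::from(byte[0] & 0x7f) << (7 * i);
            // A clear high bit marks the last byte of the varint.
            if byte[0] & 0x80 == 0 {
                return Ok(Some(value));
            }
        }
        Err(io::Error::new(ErrorKind::InvalidData, "length varint longer than 10 bytes"))
    }
}

impl<R: Read> Iterator for FrameReader<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        let len = match self.read_len() {
            Ok(Some(len)) => len,
            Ok(None) => return None,
            Err(e) => return Some(Err(e)),
        };
        let mut payload = Vec::new();
        // `take` caps the read at `len` bytes without pre-allocating
        // a length that has not been validated yet.
        match self.inner.by_ref().take(len).read_to_end(&mut payload) {
            Ok(n) if n as u64 == len => Some(Ok(payload)),
            Ok(_) => Some(Err(ErrorKind::UnexpectedEof.into())),
            Err(e) => Some(Err(e)),
        }
    }
}
```

Callers should stop at the first `Err`, since the stream position is no longer at a frame boundary after a failure.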

So I was thinking that if I'm going to write the latter, it would make sense to share it with others (and get another pair of eyes to review the errors I'll make 😇). Before I dive in, I have a few questions, though:

  • Do you see another better way I'm missing?
  • Where in the library should I put it, and with what interface? I was thinking about something along the lines of read_length_delimited<R: Read, M: Msg>(read: R) -> impl Iterator<Result<M, SomeError { IoError | DecodeError }>>, but I'm open to other suggestions.
  • Should it come with a write_length_delimited counterpart?
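For the write side, a rough sketch of what such a counterpart could look like with only the standard library. The name `write_length_delimited` is taken from the proposal above and is not an existing prost function. The payload is assumed to be an already-encoded message, and the length prefix is assumed to be a base-128 varint.

```rust
use std::io::{self, Write};

/// Writes `payload` preceded by its length as a base-128 varint.
/// Hypothetical helper; `payload` is assumed to be an encoded message.
pub fn write_length_delimited<W: Write>(mut out: W, payload: &[u8]) -> io::Result<()> {
    let mut len = payload.len() as u64;
    // Emit 7 bits at a time, least significant group first,
    // setting the high bit on every byte except the last.
    while len >= 0x80 {
        out.write_all(&[((len & 0x7f) as u8) | 0x80])?;
        len >>= 7;
    }
    out.write_all(&[len as u8])?;
    out.write_all(payload)
}
```

As far as I can tell, prost's own `Message::encode_length_delimited` already produces the same framing into a `BufMut`, so a helper like this would mostly matter when writing straight to a `Write`.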

Thank you

@danburkert
Collaborator

Do you have control over the format? If so I'd recommend using a fixed width delimiter, it makes it a bit easier. You can see an example of using std::io::Read with fixed length delimiters in the conformance test runner. With variable width delimiters reading the length tag becomes a bit more tedious, but it's definitely doable.
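To illustrate why a fixed-width delimiter is easier, here is a small sketch that assumes a 4-byte little-endian length prefix (an assumption for this example; check the actual format before relying on it). `read_fixed_frame` is a hypothetical name. The prefix has a known size, so no varint loop is needed.

```rust
use std::io::{self, ErrorKind, Read};

/// Reads one frame with a fixed 4-byte little-endian length prefix.
/// Returns `Ok(None)` on a clean EOF at a frame boundary.
pub fn read_fixed_frame<R: Read>(mut input: R) -> io::Result<Option<Vec<u8>>> {
    let mut prefix = [0u8; 4];
    let mut filled = 0;
    // Fill the prefix manually so that "no bytes at all" (clean EOF)
    // can be told apart from "prefix cut short" (corrupt input).
    while filled < prefix.len() {
        match input.read(&mut prefix[filled..]) {
            Ok(0) if filled == 0 => return Ok(None),
            Ok(0) => return Err(ErrorKind::UnexpectedEof.into()),
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    let len = u64::from(u32::from_le_bytes(prefix));
    let mut payload = Vec::new();
    // `take` limits the read to `len` bytes without trusting it for allocation.
    let read = input.by_ref().take(len).read_to_end(&mut payload)?;
    if read as u64 != len {
        return Err(ErrorKind::UnexpectedEof.into());
    }
    Ok(Some(payload))
}
```

Passing `&mut reader` lets the caller keep ownership and call the function again for the next frame.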

@vorner
Contributor Author

vorner commented Mar 2, 2019

> Do you have control over the format?

Not really. I already have some files with data.

Yes, I'm pretty sure I can make it work, and "tedious" is about as good a description as any of how I expect the implementation to look.

My question was more in the sense, is it OK to put the code into prost once I write it? If so, do you have any preferences on the interface, naming and such?

@vorner
Contributor Author

vorner commented Apr 25, 2019

@danburkert Could I ask for your opinion? Could you give the draft linked above a quick skim?

I want to polish the thing eventually. But do you want this in the prost crate proper, or should I just spin up some prost-io crate of my own?

Thanks

@tiziano88

Sorry to resurrect this old conversation, I am trying to do something similar, but I don't think I can even get it to work with the Buf approach; AFAICT https://docs.rs/prost/0.7.0/prost/trait.Message.html#method.decode_length_delimited consumes the entire buffer at once, so how would I be able to read multiple records from the buffer?

@vorner
Contributor Author

vorner commented Mar 30, 2021

You can pass &mut T as the buffer and regain the ownership (https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#impl-Buf-for-%26mut%20T). You can also limit the reader to take only what's necessary (https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#method.take).

Alternatively, you can pass slices (they also implement Buf).
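The mechanism behind this advice can be shown without prost: when a function receives `&mut &[u8]`, it can consume a prefix and leave the caller holding the rest. The sketch below is a standard-library stand-in, and `next_frame` is a hypothetical name. It splits off one varint-length-delimited frame and advances the caller's slice. With prost and bytes, the same loop shape would presumably call `Message::decode_length_delimited(&mut buf)` while `buf.has_remaining()`.

```rust
/// Splits the next varint-length-delimited frame off the front of `buf`.
/// On success `buf` is advanced past the frame. Returns `None`, leaving
/// `buf` untouched, if it is empty or does not hold a complete frame.
pub fn next_frame<'a>(buf: &mut &'a [u8]) -> Option<&'a [u8]> {
    // Copy the inner reference so results borrow for 'a, not for `buf`.
    let data: &'a [u8] = *buf;
    let mut len: u64 = 0;
    let mut used = 0usize;
    loop {
        if used == 10 {
            return None; // malformed: varint longer than 10 bytes
        }
        let byte = *data.get(used)?; // truncated length
        len |= u64::from(byte & 0x7f) << (7 * used);
        used += 1;
        if byte & 0x80 == 0 {
            break;
        }
    }
    let rest = &data[used..];
    if (rest.len() as u64) < len {
        return None; // truncated payload
    }
    let (frame, tail) = rest.split_at(len as usize);
    // Advance the caller's slice: this is the "regain ownership" part.
    *buf = tail;
    Some(frame)
}
```

A caller would keep a `let mut buf: &[u8] = &data;` and loop with `while let Some(frame) = next_frame(&mut buf) { ... }`, checking afterwards that `buf` is empty to detect trailing garbage.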

@tiziano88

Ah, I missed the https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#impl-Buf-for-%26mut%20T impl, thanks for pointing out! All good now :)

@hvina

hvina commented Apr 12, 2021

> Ah, I missed the https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#impl-Buf-for-%26mut%20T impl, thanks for pointing out! All good now :)

@tiziano88 @vorner Could you possibly attach a code sample of doing that? I am struggling with it. I want to loop through the messages that are read. This is what I have so far.

Thanks a lot, and sorry for piling on to a thread that was meant for a different purpose.

use std::error::Error;
use std::fs::File;
use std::io::{Cursor, Read};

pub mod dnstap {
    include!(concat!(env!("OUT_DIR"), "/dnstap.rs"));
}

fn main() -> Result<(), Box<dyn Error>> {
    println!("Hello, world!");
    let mut f = File::open("dnstap.log")?;
    let mut buffer = Vec::new();
    // Read the whole file into memory.
    f.read_to_end(&mut buffer)?;
    let mut cursor: Cursor<Vec<u8>> = Cursor::new(buffer);
    // Decodes only the first length-delimited message in the buffer.
    let message: dnstap::Message = prost::Message::decode_length_delimited(&mut cursor)?;
    println!("Finished Reading! {:?}", message);
    Ok(())
}

where dnstap.rs is generated from dnstap.proto (from GitHub) using prost-build.
