Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undesirable(?) implementation of DerefMut on BytesMut #757

Open
dowlingw opened this issue Jan 24, 2025 · 5 comments
Open

Undesirable(?) implementation of DerefMut on BytesMut #757

dowlingw opened this issue Jan 24, 2025 · 5 comments

Comments

@dowlingw
Copy link

dowlingw commented Jan 24, 2025

Note: I am a relatively new Rust user - so if the following is a skill issue please let me know 😅🙏

When BytesMut is implicitly coerced into a &mut [u8] through it's implementation of the DerefMut trait, callers may mutate the buffer but will not be able to advance the internal cursor.

In the 80% case, this is fine and will be caught during development time - whilst data will have been written to the underlying storage, the length will show as 0 and the developer will realise their mistake at runtime.

However this assumption does not always hold true - an empty buffer can still have valid semantics, especially in context of handling binary data. Below is an example of this deserialising protobuf messages using the prost library (noting that prost::bytes::BytesMut is a re-export of this crates types) where this behaviour causes an issue.

It seems like a footgun to allow users to implicitly and automatically allow coercion - this removes the usefulness of the type system to steer engineers to the correct behaviour. Especially so given we already implement the BorrowMut requiring an explicit call.

Without deep understanding of how this crate is used I would suggest:

  1. Document this limitation and steer developers to using BorrowMut
  2. Consider future removal of the DerefMut trait for this type as a breaking change
use prost::bytes::BytesMut;
use prost::Message;
use generated::ClientMessage;
use std::io;
use std::net::UdpSocket;
use tokio::net::UdpSocket;

#[tokio::main]
async fn main() -> io::Result<()> {
    let sock = UdpSocket::bind("0.0.0.0:1337").await?;

    let mut buf = BytesMut::default();
    // let (len, addr) = sock.recv_from(&mut buf).await?;       // Incorrect
    // let (len, addr) = sock.recv_buf_from(&mut buf).await?;   // Correct

    if let Some(msg) = ClientMessage::decode(&mut buf) {
        // ...
        // Both versions of the code compile and reach this block
        // When running the incorrect version - all message fields are empty
        //
        // Because:
        //  `BytesMut` implements `DerefMut<[u8]>` (allowing implicit type coercion)
        //
        //  `recv_from` takes `&mut [u8]` (or anything that can be coerced into that shape)
        //  `recv_from` doesn't/can't set the length because it operates on the buffer directly instead of via the API
        //
        //  In proto3 a zero length payload is valid because:
        //    All fields are optional
        //    Only fields that are present are serialised 
        //
        // ...
    }
}
@dowlingw
Copy link
Author

dowlingw commented Jan 24, 2025

Note: There is an existing issue around this trait/type (#387) but I do not believe this is describing the same issue arising from this.

@Darksonn
Copy link
Contributor

FYI, we just opened a call for breaking changes.

@dowlingw
Copy link
Author

I think this should be possible to implement using the semver trick.

  • v2 would remove its implementation of the DerefMut trait

  • v1 would use the newtype pattern to implement the DerefMut trait using the existing BorrowMut method

I'll have another look on the breaking changes issue to see how things are being coordinated, and if I have time would be happy to try raise PRs for both parts.

@geeklint
Copy link

The semver trick works explicitly on re-exports, it works because the compiler sees that bytesv1::BytesMut is the same type as bytesv2::BytesMut, so a newtype wouldn't be a solution.

@Kixunil
Copy link

Kixunil commented Jan 25, 2025

It's impossible to remove the impl without breakage and it happens that there's a similar issue in std: read/read_exact vs read_to_end when used with &mut Vec<u8>. I do think you have a point but at this point we might as well say it's a skill issue. Perhaps if compiler diagnostics could be improved somehow to catch these but we also need legitimate uses to not warn.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants