Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: update DetectContentType sniff signature for MP3 without ID3 #21124

Open
odeke-em opened this issue Jul 22, 2017 · 9 comments
Open
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@odeke-em
Copy link
Member

This is a placeholder bug to remind me to look into updating the http.DetectContentType sniff mechanisms to accomodate an addition to whatwg/mimesniff in which MP3s can be sniffed without an ID3 tag https://mimesniff.spec.whatwg.org/#signature-for-mp3-without-id3.

This shall require updating the max buffer size from 512 to 1445 and reviewing that algorithm described in https://mimesniff.spec.whatwg.org/#signature-for-mp3-without-id3. I noticed this after seeing this tweet https://twitter.com/mimesniff/status/888665270025420800 which pointed me to PR whatwg/mimesniff#4.

This can be discussed when the tree opens up for Go1.10.

@odeke-em odeke-em self-assigned this Jul 22, 2017
@odeke-em odeke-em added this to the Go1.10 milestone Jul 22, 2017
@earthboundkid
Copy link
Contributor

Can there be an exported constant for the buffer size that DetectContentType expects if its going to change? Right now I just hardcode sending 512 bytes to it when I use it.

@bradleyfalzon
Copy link

Further to this, I have a local change to implement whatwg/mimesniff#27 and whatwg/mimesniff#28 but I was waiting until upstream clarified their positions. Not sure if there's value implementing what's currently clarified already though (eg implement application/vnd.ms-fontobject and application/font-woff).

@bradfitz bradfitz modified the milestones: Go1.10, Go1.11 Nov 27, 2017
@bradfitz bradfitz added FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsFix The path to resolution is known, but the work has not been done. labels Nov 27, 2017
@agnivade
Copy link
Contributor

@odeke-em - Just checking if this is in your radar ? Or I can spend some time with this and whip up a CL.

@odeke-em
Copy link
Member Author

@agnivade thanks for the ping, please go ahead, it is all yours :) Just make sure
that the content is from mimesniff.spec.whatwg.org which is our resource for content type sniffing.

@agnivade
Copy link
Contributor

Great, thanks.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/101375 mentions this issue: net/http: sniffing mp3 content without ID3

@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 May 4, 2018
@agnivade
Copy link
Contributor

Just wanted to mention that the algorithm in mimesniff.spec.whatwg.org has slight mistakes which is why I was unable to make it work in my CL. As confirmed by this issue - whatwg/mimesniff#70.

@odeke-em
Copy link
Member Author

Thank you @agnivade for tackling this! The whatwg/mimesniff spec hasn't yet been fixed and I don't think that will happen during this cycle, so perhaps punting this to Unplanned and I'll also assign it to you.

@odeke-em odeke-em assigned agnivade and unassigned odeke-em Nov 29, 2018
@odeke-em odeke-em modified the milestones: Go1.12, Unplanned Nov 29, 2018
@c12h
Copy link

c12h commented Dec 2, 2018

There is a deeper problem here. WHATWG's MIME sniffing spec is designed for use in browsers, and only in browsers. This is actually explained in the Introduction; I guess I'm not the only person who failed to notice this the first few times I read the spec. The key sentence is in the last paragraph:

This document describes a content sniffing algorithm that carefully balances the compatibility needs of user agent[s] with the security constraints imposed by existing web content.

So, for example, when DetectContentType classifies a text file containing “GIF87a is over 30 years old” as "image/gif", it is implementing the MIME spec correctly. The problem is that the WHATWG's algorithm is not appropriate outside the narrow context of a web browser handling downloaded files securely. Sigh.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

7 participants