From aa2980ff02fe0b938859ca5c2566c27417447be1 Mon Sep 17 00:00:00 2001 From: Anne van Kesteren Date: Thu, 27 Oct 2022 11:49:12 +0200 Subject: [PATCH 1/5] Define opaque-response blocking This is good enough for early review, but there are a number of issues that still need resolving: https://github.com/annevk/orb/labels/mvp. There are also some inline TODO comments. A PR against HTML is needed to ensure it passes the appropriate metadata for media element and classic script requests. We might also want to depend on HTML for parsing JavaScript. --- fetch.bs | 313 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 305 insertions(+), 8 deletions(-) diff --git a/fetch.bs b/fetch.bs index 18ff28f7c..69706634c 100644 --- a/fetch.bs +++ b/fetch.bs @@ -52,7 +52,9 @@ urlPrefix:https://w3c.github.io/hr-time/#;spec:hr-time urlPrefix:https://tc39.es/ecma262/#;type:dfn;spec:ecma-262 url:realm;text:realm url:sec-list-and-record-specification-type;text:Record - url:current-realm;text:current realm + url:sec-parsetext;text:ParseText + url:prod-Script;text:Script + url:script-record;text:Script Record
@@ -2161,6 +2163,17 @@ Unless stated otherwise, it is false.
 
 

This flag is for exclusive use by HTML's render-blocking mechanism. [[!HTML]] +

A request has an associated +no-cors media request state ... + +

This is for exclusive use by the opaque-response-safelist check. + +

A request has an associated +no-cors JavaScript fallback encoding (an encoding). Unless +stated otherwise, it is UTF-8. + +

This is for exclusive use by the opaque-response-safelist check. +


A request has an associated @@ -3275,6 +3288,285 @@ through TLS using ALPN. The protocol cannot be spoofed through HTTP requests in +

Opaque-response blocking

+ +
+

Opaque-response blocking, also known as ORB, is a network filter that blocks access + to opaque filtered responses. These responses would likely not have been useful to the + fetching party. Blocking them reduces information leakage to potential attackers. + +

Essentially, CSS, JavaScript, images, and media (audio and video) can be requested across + origins without the CORS protocol. Unfortunately, except for CSS, there is no MIME type + enforcement. This algorithm aims to block as many responses as possible that are not one of these + types (or are newer variants of those types) to avoid leaking their contents through side channels. + +

The network filter combines pro-active blocking based on response headers, sniffing a limited + set of bytes, and ultimately falls back to a full parse due to unfortunate (lack of) design + decisions in the early days of the web platform. As a result there are still quite a few responses + whose secrets can end up being revealed to attackers. Web developers are strongly encouraged to use + the `Cross-Origin-Resource-Policy` response header to defend them. +

+ + +

The opaque-response-safelist check

+ +

The opaque-response-safelist check, given a request request +and a response response, is to run these steps: + +

    +
  1. Let mimeType be the result of extracting a MIME type from + response's header list. + +

  2. Let nosniff be the result of determining nosniff given + response's header list. + +

  3. +

    If mimeType is not failure, then: + +

      +
    1. If mimeType is an opaque-response-safelisted MIME type, then return + true. + +

    2. If mimeType is an opaque-response-blocklisted-never-sniffed MIME type, + then return false. + +

    3. If response's status is 206 and mimeType is an + opaque-response-blocklisted MIME type, then return false. + +

    4. If nosniff is true and mimeType is an + opaque-response-blocklisted MIME type or its essence is + "text/plain", then return false. +

    + +
  4. If request's no-cors media request state is + "subsequent", then return true. + +

  5. If response's status is 206 and + validate a partial response given 0 and response returns invalid, then return + false. + + +

  6. Let bytes be the result of running + obtain a copy of the first 1024 bytes of response given response. + +

  7. If bytes is failure, then return false. + +

  8. +

    If the audio or video type pattern matching algorithm given bytes does not + return undefined, then: + +

      +
    1. If request's no-cors media request state is not + "initial", then return false. + +

    2. If response's status is not 200 or 206, then return false. + +

    3. Return true. +

    + +
  9. If request's no-cors media request state is not + "N/A", then return false. + +

  10. If the image type pattern matching algorithm given bytes does not return + undefined, then return true. + +

  11. +

    If nosniff is true, then return false. + +

    This check is made late as unfortunately images and media are always sniffed. + +

  12. If response's status is not an ok status, then return + false. + +

  13. +

    If mimeType is failure, then return true. + +

    This could be improved at somewhat significant cost. See + annevk/orb #28. + +

  14. If mimeType's essence starts with + "audio/", "image/", or "video/", then return false. + +

  15. Return the result of determining if response is JavaScript and not JSON given request and response. +

+ +
+ +
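The header-based part of the check (step 3 above) can be sketched in Python as follows. This is an illustrative sketch, not spec text: the function name `header_verdict`, the `None` return value meaning "continue to the sniffing steps", and the abbreviated MIME type sets are all assumptions made for the example. Step 3.4 is read here as "nosniff is true and (the type is blocklisted or its essence is text/plain)", which is how this first patch words it.

```python
from typing import Optional

# Abbreviated, illustrative subsets; the real sets are longer.
JAVASCRIPT_ESSENCES = {"text/javascript", "application/javascript", "application/ecmascript"}
NEVER_SNIFFED_ESSENCES = {"application/pdf", "application/zip", "text/csv", "text/event-stream"}


def is_safelisted(essence: str) -> bool:
    """JavaScript MIME type, text/css, or image/svg+xml."""
    return essence in JAVASCRIPT_ESSENCES or essence in {"text/css", "image/svg+xml"}


def is_blocklisted(essence: str) -> bool:
    """HTML, JSON, or XML MIME type."""
    return (
        essence == "text/html"
        or essence in {"application/json", "text/json"}
        or essence.endswith("+json")
        or essence in {"application/xml", "text/xml"}
        or essence.endswith("+xml")
    )


def header_verdict(essence: Optional[str], status: int, nosniff: bool) -> Optional[bool]:
    """Return True or False when headers alone decide; None when sniffing is needed.

    `essence` is None when extracting a MIME type returned failure.
    """
    if essence is None:
        return None
    if is_safelisted(essence):
        return True
    if essence in NEVER_SNIFFED_ESSENCES:
        return False
    if status == 206 and is_blocklisted(essence):
        return False
    if nosniff and (is_blocklisted(essence) or essence == "text/plain"):
        return False
    return None
```

The order of the checks matters: "image/svg+xml" is also an XML MIME type, but it is safelisted before the blocklist is consulted.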

To obtain a copy of the first 1024 bytes of response, given a response +response, run these steps: + +

    +
  1. Let first1024Bytes be null. + +

  2. +

    In parallel: + +

      +
    1. Let bytes be the empty byte sequence. + +

    2. Let transformStream be a new {{TransformStream}}. + +

    3. +

      Let transformAlgorithm given a chunk be these steps: + +

        +
      1. Enqueue chunk in transformStream. + +

      2. +

        If first1024Bytes is null, then: + +

          +
        1. Let chunkBytes be + a copy of the bytes held by + chunk. + +

        2. Append chunkBytes to bytes. + +

        3. +

          If bytes's length is greater than 1024, then: + +

            +
          1. Truncate bytes from the end so that it only contains 1024 bytes. + +

          2. Set first1024Bytes to bytes. +

          +
        +
      + +
    4. Let flushAlgorithm be this step: if first1024Bytes is null, then set + first1024Bytes to bytes. + +

    5. Set up transformStream with + transformAlgorithm set to + transformAlgorithm and flushAlgorithm set + to flushAlgorithm. + +

    6. Set response's body's stream to the result + of response's body's stream + piped through transformStream. +

    + +
  3. Wait until first1024Bytes is non-null or response's + body's stream is errored. + +

  4. If first1024Bytes is null, then return failure. + +

  5. Return first1024Bytes. +
+ +
+ +
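The buffering logic above can be approximated outside of a streams implementation. The Python sketch below is a simplified model under stated assumptions: the class name `First1024Peeker` and its methods are hypothetical, chunks are plain `bytes`, and stream errors are not modeled. It mirrors the transform and flush algorithms: every chunk is passed through unchanged, and the copy is finalized either once more than 1024 bytes have been seen or when the body ends.

```python
from typing import Optional


class First1024Peeker:
    """Pass chunks through while retaining a copy of at most the first 1024 bytes."""

    def __init__(self) -> None:
        self._buffer = bytearray()
        self.first_1024: Optional[bytes] = None  # stays None until finalized

    def transform(self, chunk: bytes) -> bytes:
        # Mirrors transformAlgorithm: copy bytes only until the prefix is finalized.
        if self.first_1024 is None:
            self._buffer += chunk
            if len(self._buffer) > 1024:
                self.first_1024 = bytes(self._buffer[:1024])
        return chunk  # the chunk itself is always forwarded unchanged

    def flush(self) -> None:
        # Mirrors flushAlgorithm: a short body finalizes with whatever was collected.
        if self.first_1024 is None:
            self.first_1024 = bytes(self._buffer)
```

Note that, as the algorithm is written, a body of exactly 1024 bytes is only finalized by the flush step, because the length comparison is "greater than 1024".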

To determine if response is JavaScript and not JSON given a response +response, run these steps:

+ +
    +
  1. Let responseBodyBytes be null. + +

  2. +

    Let processBody given a byte sequence bytes be these steps: + +

      +
    1. Set responseBodyBytes to bytes. + +

    2. Set response's body to the body + of the result of safely extracting bytes. +

    + +
  3. Let processBodyError be this step: set responseBodyBytes to failure. + +

  4. Fully read response's body given processBody + and processBodyError. + +

  5. Wait for responseBodyBytes to be non-null. + +

  6. If responseBodyBytes is failure, then return false. + +

  7. Assert: responseBodyBytes is a byte sequence. + +

  8. +

    If parse JSON bytes to a JavaScript value given responseBodyBytes does not + throw, then return false. If it throws, catch the exception and ignore it. + +

    If there is an exception, response is not JSON. If there is not, it is. + +

  9. Let potentialMIMETypeForEncoding be the result of extracting a MIME type + given response's header list. + +

  10. +

    Let encoding be the result of legacy extracting an encoding given + potentialMIMETypeForEncoding and request's + no-cors JavaScript fallback encoding. + +

    Equivalently to fetch a classic script, this ignores the + MIME type essence. + +

  11. Let sourceText be the result of decoding + responseBodyBytes given encoding. + +

  12. If ParseText(sourceText, Script) returns a Script Record, + then return true. + + +

  13. Return false. +

+ + +
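The JSON half of this algorithm (step 8) can be approximated in Python. This is a rough sketch under assumptions: the helper name `parses_as_json` is hypothetical, Python's `json` module stands in for "parse JSON bytes to a JavaScript value", and the non-standard `NaN` and `Infinity` constants that Python accepts by default are rejected explicitly. The JavaScript parsing step (ParseText) is not modeled here.

```python
import json


def _reject_constant(name: str) -> None:
    # Python's json module accepts NaN/Infinity/-Infinity; JSON itself does not.
    raise ValueError(f"{name} is not valid JSON")


def parses_as_json(body: bytes) -> bool:
    """Return True when the body decodes as UTF-8 text that is a valid JSON value."""
    text = body.decode("utf-8", errors="replace")
    if text.startswith("\ufeff"):
        text = text[1:]  # drop a leading byte order mark
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True
```

One consequence worth noting: a body such as `"use strict"` or `42` is both valid JSON and a valid script, and under the steps above it is treated as JSON, so the algorithm returns false for it.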

New MIME type sets

+ +

The definitions in this section are solely for the purpose of abstracting parts of the +opaque-response-safelist check. They are not suited for usage elsewhere. + +

An opaque-response-safelisted MIME type is a JavaScript MIME type or a +MIME type whose essence is "text/css" or +"image/svg+xml". + +

An opaque-response-blocklisted MIME type is an HTML MIME type, +JSON MIME type, or XML MIME type. + +

An opaque-response-blocklisted-never-sniffed MIME type is a MIME type +whose essence is one of: + +

+ +

HTTP extensions

@@ -5237,19 +5529,23 @@ these steps:
  • Set response and internalResponse to the result of running HTTP-network-or-cache fetch given fetchParams. -

  • -

    If request's response tainting is "cors" and a - CORS check for request and response returns failure, then return a - network error. +

  • If request's response tainting is "opaque", + response's status is not a redirect status, and the + opaque-response-safelist check given request and response returns + false, then return a network error. -

    As the CORS check is not to be applied to responses whose - status is 304 or 407, or responses from a service worker for - that matter, it is applied here. +

  • If request's response tainting is "cors" and + the CORS check for request and response returns failure, then return + a network error.

  • If the TAO check for request and response returns failure, then set request's timing allow failed flag. +

    As the opaque-response-safelist check, CORS check, and + TAO check are not to be applied to responses whose status + is 304 or 407, or to responses from a service worker, they are applied here. +

  • If either request's response tainting or response's type is "opaque", and the @@ -9152,6 +9448,7 @@ Mohamed Zergaoui, Mohammed Zubair Ahmed, Moritz Kneilmann, Ms2ger, +Nathan Froyd, Nico Schlömer, Nicolás Peña Moreno, Nidhi Jaju, From cbe9f321c6443fb0751f871bdd57c4c7dac17c01 Mon Sep 17 00:00:00 2001 From: Sean Feng Date: Mon, 27 May 2024 13:41:26 -0400 Subject: [PATCH 2/5] Add the `validate a partial response` algorithm and the `extract content-range values` algorithm --- fetch.bs | 115 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 114 insertions(+), 1 deletion(-) diff --git a/fetch.bs b/fetch.bs index 69706634c..013a4fc3c 100644 --- a/fetch.bs +++ b/fetch.bs @@ -3344,7 +3344,6 @@ and a response response, is to run these steps:

  • If response's status is 206 and validate a partial response given 0 and response returns invalid, then return false. -

  • Let bytes be the result of running obtain a copy of the first 1024 bytes of response given response. @@ -3392,6 +3391,81 @@ and a response response, is to run these steps:


    +
    +

    To extract content-range values, given a response response +run these steps:

    + +

      +
    1. If response’s header list does not contain `Content-Range`, then return failure. + +

    2. Let contentRangeValue be the value of the first header whose name is a + byte-case-insensitive match for `Content-Range` in response’s header list. + +

    3. If parsing contentRangeValue per single byte content-range fails, then return failure. + +

    4. Let firstBytePos be the portion of contentRangeValue named + first-byte-pos when parsed as single byte content-range, parsed as an integer. + +

    5. Let lastBytePos be the portion of contentRangeValue named + last-byte-pos when parsed as single byte content-range, parsed as an integer. + +

    6. Let completeLength be the portion of contentRangeValue named + complete-length when parsed as single byte content-range. + +

    7. If completeLength is "*", then set completeLength to null, otherwise + set completeLength to completeLength parsed as an integer. + +

    8. Return firstBytePos, lastBytePos, and completeLength. +

    + +

    Parsing as an integer infra/189 +

    + +
    + +
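As an illustration, the extraction steps above could look like this in Python. This is a sketch, not the normative algorithm: the function name is hypothetical, the header value is passed in as a string, `None` stands in for failure, and the unit prefix is assumed to be `bytes` followed by a single space, as in the `Content-Range` wire format of RFC 9110 (for example `bytes 0-1023/4096`).

```python
import re
from typing import Optional, Tuple

# ASCII digits only; "*" is allowed solely for the complete length.
_CONTENT_RANGE = re.compile(r"bytes ([0-9]+)-([0-9]+)/([0-9]+|\*)")


def extract_content_range_values(
    value: Optional[str],
) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (first_byte_pos, last_byte_pos, complete_length) or None on failure.

    `value` is the Content-Range header value, or None when the header is absent.
    An unknown complete length ("*") is reported as None.
    """
    if value is None:
        return None
    match = _CONTENT_RANGE.fullmatch(value)
    if match is None:
        return None
    first, last, complete = match.groups()
    return int(first), int(last), None if complete == "*" else int(complete)
```

Unsatisfied-range values such as `bytes */4096` do not match the single-range grammar and therefore yield failure.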
    +

    To validate a partial response, given an integer expectedRangeStart, a +response partialResponse, and an optional response previousResponse (default null), +run these steps:

    + +
      +
    1. Assert: partialResponse's status is 206. + +

    2. Let responseFirstBytePos, responseLastBytePos, and responseCompleteLength be the + result of extracting content-range values from partialResponse. If this fails, then return invalid. + +

    3. If responseFirstBytePos does not equal expectedRangeStart, then return invalid. + +

    4. If previousResponse is not null, then: + +

        +
      1. For headerName of « `ETag`, `Last-Modified` »: + +

          +
        1. If previousResponse's header list contains headerName + and the combined value of headerName + in previousResponse's header list does not equal the + combined value of headerName in partialResponse's + header list, then return invalid. +
        + +
      2. If previousResponse's status is 206, then: + +

          +
        1. Let previousResponseFirstBytePos, previousResponseLastBytePos, + and previousResponseCompleteLength be the result of extracting content-range values + from previousResponse. If this fails, then return invalid. + +

        2. If previousResponseCompleteLength is not null, and + responseCompleteLength does not equal previousResponseCompleteLength, then return invalid. +

        +
      +
    5. Return valid. +
    +
    + +
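The validation steps above can be sketched in Python. Everything named here is an assumption for illustration: the `Response` dataclass, the lower-cased header dictionary (one combined value per name), and the boolean return value (`True` for valid, `False` for invalid). The `Content-Range` value is assumed to use the `bytes first-last/complete` wire format.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

_CONTENT_RANGE = re.compile(r"bytes ([0-9]+)-([0-9]+)/([0-9]+|\*)")


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)  # lower-cased names


def content_range_values(response: Response) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (first, last, complete_length) or None when extraction fails."""
    match = _CONTENT_RANGE.fullmatch(response.headers.get("content-range", ""))
    if match is None:
        return None
    first, last, complete = match.groups()
    return int(first), int(last), None if complete == "*" else int(complete)


def validate_partial_response(
    expected_range_start: int,
    partial: Response,
    previous: Optional[Response] = None,
) -> bool:
    assert partial.status == 206
    values = content_range_values(partial)
    if values is None:
        return False
    first, _last, complete = values
    if first != expected_range_start:
        return False
    if previous is not None:
        # A validator present on the previous response must match on the new one.
        for name in ("etag", "last-modified"):
            if name in previous.headers and previous.headers[name] != partial.headers.get(name):
                return False
        if previous.status == 206:
            previous_values = content_range_values(previous)
            if previous_values is None:
                return False
            previous_complete = previous_values[2]
            if previous_complete is not None and complete != previous_complete:
                return False
    return True
```

For example, a second range response whose `ETag` differs from the first one is rejected, which guards against stitching together bytes from two different resources.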
    +

    To obtain a copy of the first 1024 bytes of response, given a response response, run these steps: @@ -3813,6 +3887,45 @@ response headers, the value `*` coun requests without credentials. For such requests there is no way to solely match a header name or method that is `*`. +

    ABNF for a single byte content-range: + +

    
    +"bytes " first-byte-pos "-" last-byte-pos "/" complete-length
    +first-byte-pos = 1*DIGIT
    +last-byte-pos  = 1*DIGIT
    +complete-length = ( 1*DIGIT / "*" )
    +
    + 

    This is a subset of what RFC 7233 allows. + 

    + + The above as a railroad diagram: + 
    +  T: "bytes "
    +  Stack:
    +    Sequence:
    +      Comment: first-byte-pos
    +      OneOrMore:
    +        N: digit
    +      Comment: /first-byte-pos
    +      N: "-"
    +    Sequence:
    +      Comment: last-byte-pos
    +      OneOrMore:
    +        N: digit
    +      Comment: /last-byte-pos
    +      N: "/"
    +    Sequence:
    +      Comment: complete-length
    +      Choice:
    +        N: "*"
    +        OneOrMore:
    +          N: digit
    +      Comment: /complete-length
    +  
    +

    CORS protocol and credentials

    From 5818ed421fc5a3429a26e99ffc93ed4e2fa31d60 Mon Sep 17 00:00:00 2001 From: Sean Feng Date: Mon, 27 May 2024 13:43:07 -0400 Subject: [PATCH 3/5] Update the ORB spec to align with Firefox's implementation --- fetch.bs | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/fetch.bs b/fetch.bs index 013a4fc3c..d0be5272c 100644 --- a/fetch.bs +++ b/fetch.bs @@ -3333,8 +3333,7 @@ and a response response, is to run these steps:
  • If response's status is 206 and mimeType is an opaque-response-blocklisted MIME type, then return false. -

  • If nosniff is true and mimeType is an - opaque-response-blocklisted MIME type or its essence is +

  • If nosniff is true and mimeType's essence is "text/plain", then return false. @@ -3466,6 +3465,7 @@ run these steps:


    +

    To obtain a copy of the first 1024 bytes of response, given a response response, run these steps: @@ -3527,9 +3527,11 @@ run these steps:

  • Return first1024Bytes. +
    +

    To determine if response is JavaScript and not JSON given a response response, run these steps:

    @@ -3585,6 +3587,7 @@ run these steps:

  • Return false. +

    New MIME type sets

    @@ -3603,12 +3606,14 @@ run these steps:

    whose essence is one of:
      +
    • "application/dash+xml"
    • "application/gzip"
    • "application/msexcel"
    • "application/mspowerpoint"
    • "application/msword"
    • "application/msword-template"
    • "application/pdf" +
    • "application/vnd.apple.mpegurl"
    • "application/vnd.ces-quickpoint"
    • "application/vnd.ces-quicksheet"
    • "application/vnd.ces-quickword" @@ -3634,10 +3639,16 @@ whose essence is one of:
    • "application/x-protobuf"
    • "application/x-protobuffer"
    • "application/zip" +
    • "audio/aac" +
    • "audio/aacp" +
    • "audio/mpegurl" +
    • "audio/mpeg"
    • "multipart/byteranges"
    • "multipart/signed" +
    • "multipart/x-mixed-replace"
    • "text/event-stream"
    • "text/csv" +
    • "text/vtt"
    @@ -5643,9 +5654,15 @@ these steps: HTTP-network-or-cache fetch given fetchParams.
  • If request's response tainting is "opaque", - response's status is not a redirect status, and the - opaque-response-safelist check given request and response returns - false, then return a network error. + and response's status is not a redirect status, then: + +

      +
    1. If request's initiator type is "fetch", + then set internalResponse's body to null. + +

    2. Otherwise, if the opaque-response-safelist check given request and response returns + false, then return a network error. +

  • If request's response tainting is "cors" and the CORS check for request and response returns failure, then return From 0908c4bb5c1a22b59e9fa4766db07ffa9fa44b7f Mon Sep 17 00:00:00 2001 From: Sean Feng Date: Wed, 12 Jun 2024 14:30:37 -0400 Subject: [PATCH 4/5] Fix the warnings --- fetch.bs | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/fetch.bs b/fetch.bs index d0be5272c..ef9f21a3f 100644 --- a/fetch.bs +++ b/fetch.bs @@ -52,6 +52,7 @@ urlPrefix:https://w3c.github.io/hr-time/#;spec:hr-time urlPrefix:https://tc39.es/ecma262/#;type:dfn;spec:ecma-262 url:realm;text:realm url:sec-list-and-record-specification-type;text:Record + url:current-realm;text:current realm url:sec-parsetext;text:ParseText url:prod-Script;text:Script url:script-record;text:Script Record @@ -3392,7 +3393,7 @@ and a response response, is to run these steps:

    To extract content-range values, given a response response -run these steps:

    +run these steps:

    1. If response’s header list does not contain `Content-Range`, then return failure. @@ -3497,7 +3498,7 @@ run these steps:

    2. Append chunkBytes to bytes.

    3. -

      If bytes's length is greater than 1024, then: +

      If bytes's length is greater than 1024, then:

      1. Truncate bytes from the end so that it only contains 1024 bytes. @@ -3517,7 +3518,7 @@ run these steps:

      2. Set response's body's stream to the result of response's body's stream - piped through transformStream. + piped through transformStream.

    4. Wait until first1024Bytes is non-null or response's @@ -3532,8 +3533,8 @@ run these steps:


      -

      To determine if response is JavaScript and not JSON given a response -response, run these steps:

      +

      To determine if response is JavaScript and not JSON, given a request request, +a response response, run these steps:

      1. Let responseBodyBytes be null. @@ -3550,7 +3551,7 @@ run these steps:

      2. Let processBodyError be this step: set responseBodyBytes to failure. -

      3. Fully read response's body given processBody +

      4. Fully read response's body given processBody and processBodyError.

      5. Wait for responseBodyBytes to be non-null. @@ -3569,7 +3570,7 @@ run these steps:

        given response's header list.
      6. -

        Let encoding be the result of legacy extracting an encoding given +

        Let encoding be the result of legacy extract an encoding given potentialMIMETypeForEncoding and request's no-cors JavaScript fallback encoding. From 00ef4cf812e2fec947e799f54e6eae29a0a13484 Mon Sep 17 00:00:00 2001 From: Sean Feng Date: Thu, 8 Aug 2024 15:32:56 -0400 Subject: [PATCH 5/5] Try to use dfn-type=http-header to fix the build error --- fetch.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fetch.bs b/fetch.bs index ef9f21a3f..f0c89059b 100644 --- a/fetch.bs +++ b/fetch.bs @@ -3305,7 +3305,7 @@ through TLS using ALPN. The protocol cannot be spoofed through HTTP requests in set of bytes, and ultimately falls back to a full parse due to unfortunate (lack of) design decisions in the early days of the web platform. As a result there are still quite a few responses whose secrets can end up being revealed to attackers. Web developers are strongly encouraged to use - the `Cross-Origin-Resource-Policy` response header to defend them. + the `Cross-Origin-Resource-Policy` response header to defend them.