jnp.mean(x, dtype=bfloat16) is not respected #26365

Open · rryan opened this issue Feb 6, 2025 · 4 comments · May be fixed by #26403
rryan commented Feb 6, 2025

Description

Regarding the accumulation dtype for np.mean, the NumPy docs say:

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

By default, float16 results are computed using float32 intermediates for extra precision.

This suggests that for a bfloat16/float16 input, the default value for dtype is float32, but the user can request a different precision.

In #17792, we fixed the default upcasting, but in my testing on TPU, jnp.mean(x, dtype=jnp.bfloat16) still casts to fp32, so the dtype parameter does not seem to allow the user to override it.

I reproduced this under pjit simply with:

import jax.numpy as jnp

x = jnp.zeros((2, 3, 5), dtype=jnp.bfloat16)
y = jnp.mean(x, axis=-1, keepdims=True, dtype=x.dtype)

and I observe the following:
[screenshot attached]
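
For reference, one way to see this without a screenshot is to print the lowered computation (a minimal sketch; the exact StableHLO text varies by backend, but the convert to f32 ahead of the reduce should be visible):

import jax
import jax.numpy as jnp

def f(x):
    return jnp.mean(x, axis=-1, keepdims=True, dtype=x.dtype)

x = jnp.zeros((2, 3, 5), dtype=jnp.bfloat16)
# The lowered computation shows the bf16 -> f32 convert before the reduction,
# even though dtype=bfloat16 was requested.
print(jax.jit(f).lower(x).as_text())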

System info (python version, jaxlib version, accelerator, etc.)

(internal to google, running around cl/723930299 on borg)

rryan added the bug label Feb 6, 2025
jakevdp self-assigned this Feb 6, 2025
jakevdp (Collaborator) commented Feb 6, 2025

Thanks for the report!

jakevdp (Collaborator) commented Feb 7, 2025

I dug in a bit to understand what NumPy does here, and it looks like NumPy upcasts float16 regardless of whether the dtype is specified; for example:

In [1]: import numpy as np

In [2]: np.random.seed(0)

In [3]: x = (100 * np.random.randn(10000)).astype('float16')

In [4]: x.sum()
Out[4]: np.float16(-18430.0)

In [5]: x.sum(dtype='float16')  # same result when specifying dtype=float16
Out[5]: np.float16(-18430.0)

In [6]: x.astype('float32').sum().astype('float16')  # same result when upcasting to float32 for the sum
Out[6]: np.float16(-18430.0)

In [7]: np.cumsum(x)[-1]  # cumsum operates in float16 only, and has different rounding
Out[7]: np.float16(-18160.0)

So if jax.numpy functions are to stay true to the semantics of the NumPy functions they're implementing, we need to upcast float16 values regardless of whether the user specifies the dtype.
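
In other words, today the dtype argument effectively controls only the output dtype, and the 16-bit behavior is roughly modeled by accumulating in float32 and casting back. A sketch of that model (mean_like_today is a made-up name for illustration, single-axis only):

import jax.numpy as jnp

def mean_like_today(x, axis, keepdims=False, dtype=None):
    # Rough model of the current float16/bfloat16 behavior:
    # accumulate in float32, then cast to the requested output dtype.
    out_dtype = dtype if dtype is not None else x.dtype
    total = jnp.sum(x.astype(jnp.float32), axis=axis, keepdims=keepdims)
    return (total / x.shape[axis]).astype(out_dtype)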

That said, we do need a way to do what the original request asked for, namely allow the user to perform the reduction without a cast if they wish. I can think of a few options:

  1. Decide to diverge from NumPy, so that x.sum(dtype='float16') lets the user specify that accumulation should happen in float16.
  2. Expose lower-level summation APIs via jax.lax so that primitives like reduce_sum_p can be applied directly.
  3. Expose the upcast_f16_for_computation flag to the user-level jax.numpy APIs, so that users can configure this behavior.

All else being equal, I think (2) may be the nicest option: the fundamental request here is more control over which primitive operations are used to implement a reduction, and that approach would give you maximum control.
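
For illustration, something in the spirit of (2) can already be approximated by calling the existing jax.lax.reduce primitive directly; a minimal sketch (not a proposed API) that keeps the whole reduction in bfloat16:

import jax.numpy as jnp
from jax import lax

x = jnp.zeros((2, 3, 5), dtype=jnp.bfloat16)

# Reduce over the last axis with the reduce primitive directly; the
# accumulation runs in the operand dtype, with no float32 upcast inserted.
total = lax.reduce(x, jnp.array(0, dtype=x.dtype), lax.add, dimensions=(2,))
mean = total / x.shape[-1]
print(mean.dtype)  # bfloat16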

What do you think?

jakevdp (Collaborator) commented Feb 7, 2025

Actually, I'm kind of leaning toward doing both (1) and (2) here.

pearu (Collaborator) commented Feb 7, 2025

FWIW, the Python array API v2023.12 says: "If the data type (either specified or resolved) differs from the data type of x, the input array should be cast to the specified data type before computing the sum (rationale: the dtype keyword argument is intended to help prevent overflows)." That corresponds to option (1).

However, notice that the mean function as specified in the Python array API does not have a dtype argument.
