
XLA 2.7 releases don't work properly with the upstream torch 2.7 #8626

hosseinsarshar (Contributor) commented Jan 26, 2025

🐛 Bug

I tested the xla 2.7 nightly builds against the upstream torch 2.7 nightlies with many date combinations, and all of them resulted in faulty execution.

For example, with this install:

pip install -U --pre torch==2.7.0.dev20250124+cpu --index-url https://download.pytorch.org/whl/nightly/cpu

pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev20250124-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html

pip install -U torch_xla[pallas] -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html

I get this behaviour:

$python
Python 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_xla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hosseins/miniconda3/envs/test-new-nightly/lib/python3.10/site-packages/torch_xla/__init__.py", line 20, in <module>
    import _XLAC
ImportError: /home/hosseins/miniconda3/envs/test-new-nightly/lib/python3.10/site-packages/_XLAC.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN5torch4lazy13MetricFnValueEd
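That undefined symbol is a mangled C++ name; running it through `c++filt` yields `torch::lazy::MetricFnValue(double)`, i.e. the torch_xla wheel was linked against a torch that still exported that function. For illustration, here is a toy demangler that handles only this simple nested-name form (no templates or substitutions), so the expected symbol can be read directly:

```python
def demangle_simple(symbol: str) -> str:
    """Demangle a simple Itanium C++ symbol of the form _ZN<names>E<params>.

    Toy sketch: handles only length-prefixed name components and a few
    builtin parameter type codes -- enough for the symbol in the error
    above, not for templates or type substitutions.
    """
    assert symbol.startswith("_ZN"), "only nested-name symbols supported"
    body, names, i = symbol[3:], [], 0
    while body[i].isdigit():
        # Each component is <decimal length><name>, e.g. "5torch".
        j = i
        while body[j].isdigit():
            j += 1
        length = int(body[i:j])
        names.append(body[j:j + length])
        i = j + length
    assert body[i] == "E", "nested name must end with E"
    codes = {"d": "double", "i": "int", "b": "bool", "v": "void"}
    params = ", ".join(codes.get(c, "?") for c in body[i + 1:])
    return "::".join(names) + "(" + params + ")"

print(demangle_simple("_ZN5torch4lazy13MetricFnValueEd"))
# torch::lazy::MetricFnValue(double)
```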

But once I downgrade the upstream torch to 2.6.0.dev20241216 (the highest version that worked), xla works as expected:

pip install -U --pre torch==2.6.0.dev20241216+cpu --index-url https://download.pytorch.org/whl/nightly/cpu

pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev20250119-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html


pip install -U torch_xla[pallas] -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html

the import works:

$ python
Python 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'xla'
>>> import torch_xla
WARNING:root:libtpu.so and TPU device found. Setting PJRT_DEVICE=TPU.
>>> torch_xla.devices()
[device(type='xla', index=0), device(type='xla', index=1), device(type='xla', index=2), device(type='xla', index=3), device(type='xla', index=4), device(type='xla', index=5), device(type='xla', index=6), device(type='xla', index=7)]
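As a side note, the working and failing combinations above differ in the `devYYYYMMDD` tags of the two version strings; a small stdlib sketch for extracting those tags (the version strings are copied from the install commands above):

```python
import re

def nightly_date(version: str):
    """Return the YYYYMMDD tag from a nightly version string such as
    '2.6.0.dev20241216+cpu', or None if it is not a dated dev build."""
    m = re.search(r"\.dev(\d{8})", version)
    return m.group(1) if m else None

# Version strings taken from the install commands above.
print(nightly_date("2.6.0.dev20241216+cpu"))  # 20241216 (working torch)
print(nightly_date("2.7.0.dev20250124+cpu"))  # 20250124 (failing torch)
```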
@hosseinsarshar hosseinsarshar changed the title XLA 2.7 releases don't work properly with upstream torch 2.7 XLA 2.7 releases don't work properly with the upstream torch 2.7 Jan 26, 2025
@bhavya01 (Collaborator)

We recently updated the README with new instructions for the nightly builds, which should fix this issue.

pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html

# Optional: if you're using custom kernels, install pallas dependencies
pip install 'torch_xla[pallas]' \
  -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html \
  -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html

Upstream PyTorch changed some environment variables used when building their wheels, which caused this issue.
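The `+cxx11` tag in the new wheel's version segment marks the C++11-ABI build. A minimal stdlib sketch for pulling the components out of a wheel filename (simplified: it assumes no optional build-tag component, which holds for the wheels in this thread):

```python
def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its five standard components.

    Simplified parse: assumes no optional build tag, which holds for
    the torch_xla wheels discussed in this thread.
    """
    stem = filename.removesuffix(".whl")
    dist, version, python, abi, platform = stem.split("-")
    return {"dist": dist, "version": version, "python": python,
            "abi": abi, "platform": platform}

tags = wheel_tags("torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl")
print(tags["version"])              # 2.7.0.dev+cxx11
print("+cxx11" in tags["version"])  # True
```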

@hosseinsarshar (Contributor, Author)

Thanks @bhavya01. I tried the new instructions as well, but the issue still remains. Here is a simple test:

$ conda create -n new-xla python=3.10
Retrieving notices: ...working... done
Channels:
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/hosseins/miniconda3/envs/new-xla

  added / updated specs:
    - python=3.10


The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main 
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu 
  bzip2              pkgs/main/linux-64::bzip2-1.0.8-h5eee18b_6 
  ca-certificates    pkgs/main/linux-64::ca-certificates-2024.12.31-h06a4308_0 
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.40-h12ee557_0 
  libffi             pkgs/main/linux-64::libffi-3.4.4-h6a678d5_1 
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1 
  libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1 
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1 
  libuuid            pkgs/main/linux-64::libuuid-1.41.5-h5eee18b_0 
  ncurses            pkgs/main/linux-64::ncurses-6.4-h6a678d5_0 
  openssl            pkgs/main/linux-64::openssl-3.0.15-h5eee18b_0 
  pip                pkgs/main/linux-64::pip-24.2-py310h06a4308_0 
  python             pkgs/main/linux-64::python-3.10.16-he870216_1 
  readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0 
  setuptools         pkgs/main/linux-64::setuptools-75.1.0-py310h06a4308_0 
  sqlite             pkgs/main/linux-64::sqlite-3.45.3-h5eee18b_0 
  tk                 pkgs/main/linux-64::tk-8.6.14-h39e8969_0 
  tzdata             pkgs/main/noarch::tzdata-2025a-h04d1e81_0 
  wheel              pkgs/main/linux-64::wheel-0.44.0-py310h06a4308_0 
  xz                 pkgs/main/linux-64::xz-5.4.6-h5eee18b_1 
  zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_1 


Proceed ([y]/n)? y


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate new-xla
#
# To deactivate an active environment, use
#
#     $ conda deactivate

$ conda activate new-xla
(new-xla) $ pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch
  Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.7.0.dev20250127%2Bcpu-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (26 kB)
Collecting torchvision
  Downloading https://download.pytorch.org/whl/nightly/cpu/torchvision-0.22.0.dev20250127%2Bcpu-cp310-cp310-linux_x86_64.whl.metadata (6.2 kB)
Collecting filelock (from torch)
  Using cached https://download.pytorch.org/whl/nightly/filelock-3.16.1-py3-none-any.whl (16 kB)
Collecting typing-extensions>=4.10.0 (from torch)
  Using cached https://download.pytorch.org/whl/nightly/typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Collecting sympy==1.13.1 (from torch)
  Using cached https://download.pytorch.org/whl/nightly/sympy-1.13.1-py3-none-any.whl (6.2 MB)
Collecting networkx (from torch)
  Using cached https://download.pytorch.org/whl/nightly/networkx-3.4.2-py3-none-any.whl (1.7 MB)
Collecting jinja2 (from torch)
  Using cached https://download.pytorch.org/whl/nightly/jinja2-3.1.4-py3-none-any.whl (133 kB)
Collecting fsspec (from torch)
  Using cached https://download.pytorch.org/whl/nightly/fsspec-2024.10.0-py3-none-any.whl (179 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy==1.13.1->torch)
  Using cached https://download.pytorch.org/whl/nightly/mpmath-1.3.0-py3-none-any.whl (536 kB)
Collecting numpy (from torchvision)
  Using cached https://download.pytorch.org/whl/nightly/numpy-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.3 MB)
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
  Using cached https://download.pytorch.org/whl/nightly/pillow-11.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.4 MB)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
  Using cached https://download.pytorch.org/whl/nightly/MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.7.0.dev20250127%2Bcpu-cp310-cp310-manylinux_2_28_x86_64.whl (175.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 175.4/175.4 MB 142.6 MB/s eta 0:00:00
Downloading https://download.pytorch.org/whl/nightly/cpu/torchvision-0.22.0.dev20250127%2Bcpu-cp310-cp310-linux_x86_64.whl (1.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 133.2 MB/s eta 0:00:00
Installing collected packages: mpmath, typing-extensions, sympy, pillow, numpy, networkx, MarkupSafe, fsspec, filelock, jinja2, torch, torchvision
Successfully installed MarkupSafe-2.1.5 filelock-3.16.1 fsspec-2024.10.0 jinja2-3.1.4 mpmath-1.3.0 networkx-3.4.2 numpy-2.1.2 pillow-11.0.0 sympy-1.13.1 torch-2.7.0.dev20250127+cpu torchvision-0.22.0.dev20250127+cpu typing-extensions-4.12.2
(new-xla) $ pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html
Looking in links: https://storage.googleapis.com/libtpu-releases/index.html, https://storage.googleapis.com/libtpu-wheels/index.html
Collecting torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (from torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (94.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 94.6/94.6 MB 146.6 MB/s eta 0:00:00
Collecting absl-py>=1.0.0 (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached absl_py-2.1.0-py3-none-any.whl.metadata (2.3 kB)
Requirement already satisfied: numpy in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl) (2.1.2)
Collecting pyyaml (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting requests (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting libtpu==0.0.8.dev20250113 (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu/libtpu-0.0.8.dev20250113%2Bnightly-py3-none-linux_x86_64.whl (132.5 MB)
Collecting tpu-info (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached tpu_info-0.2.0-py3-none-any.whl.metadata (3.7 kB)
Collecting libtpu-nightly==0.1.dev20241010+nightly.cleanup (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu-nightly/libtpu_nightly-0.1.dev20241010%2Bnightly.cleanup-py3-none-any.whl (1.3 kB)
Collecting charset-normalizer<4,>=2 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached charset_normalizer-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (35 kB)
Collecting idna<4,>=2.5 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached urllib3-2.3.0-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached certifi-2024.12.14-py3-none-any.whl.metadata (2.3 kB)
Collecting grpcio>=1.65.5 (from tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached grpcio-1.70.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.9 kB)
Collecting protobuf (from tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached protobuf-5.29.3-cp38-abi3-manylinux2014_x86_64.whl.metadata (592 bytes)
Collecting rich (from tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached rich-13.9.4-py3-none-any.whl.metadata (18 kB)
Collecting markdown-it-py>=2.2.0 (from rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Collecting pygments<3.0.0,>=2.13.0 (from rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached pygments-2.19.1-py3-none-any.whl.metadata (2.5 kB)
Requirement already satisfied: typing-extensions<5.0,>=4.0.0 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl) (4.12.2)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Using cached absl_py-2.1.0-py3-none-any.whl (133 kB)
Using cached PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB)
Using cached requests-2.32.3-py3-none-any.whl (64 kB)
Using cached tpu_info-0.2.0-py3-none-any.whl (14 kB)
Using cached certifi-2024.12.14-py3-none-any.whl (164 kB)
Using cached charset_normalizer-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (146 kB)
Using cached grpcio-1.70.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.9 MB)
Using cached idna-3.10-py3-none-any.whl (70 kB)
Using cached urllib3-2.3.0-py3-none-any.whl (128 kB)
Using cached protobuf-5.29.3-cp38-abi3-manylinux2014_x86_64.whl (319 kB)
Using cached rich-13.9.4-py3-none-any.whl (242 kB)
Using cached markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
Using cached pygments-2.19.1-py3-none-any.whl (1.2 MB)
Using cached mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Installing collected packages: libtpu-nightly, libtpu, urllib3, pyyaml, pygments, protobuf, mdurl, idna, grpcio, charset-normalizer, certifi, absl-py, requests, markdown-it-py, torch_xla, rich, tpu-info
Successfully installed absl-py-2.1.0 certifi-2024.12.14 charset-normalizer-3.4.1 grpcio-1.70.0 idna-3.10 libtpu-0.0.8.dev20250113+nightly libtpu-nightly-0.1.dev20241010+nightly.cleanup markdown-it-py-3.0.0 mdurl-0.1.2 protobuf-5.29.3 pygments-2.19.1 pyyaml-6.0.2 requests-2.32.3 rich-13.9.4 torch_xla-2.7.0+git8b24140 tpu-info-0.2.0 urllib3-2.3.0
(new-xla) $ pip install 'torch_xla[pallas]' \
  -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html \
  -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Looking in links: https://storage.googleapis.com/jax-releases/jax_nightly_releases.html, https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Requirement already satisfied: torch_xla[pallas] in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (2.7.0+git8b24140)
Requirement already satisfied: absl-py>=1.0.0 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (2.1.0)
Requirement already satisfied: numpy in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (2.1.2)
Requirement already satisfied: pyyaml in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (6.0.2)
Requirement already satisfied: requests in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (2.32.3)
Collecting jaxlib==0.4.39.dev20250113 (from torch_xla[pallas])
  Using cached https://storage.googleapis.com/jax-releases/nightly/nocuda/jaxlib-0.4.39.dev20250113-cp310-cp310-manylinux2014_x86_64.whl (101.5 MB)
Collecting jax==0.4.39.dev20250113 (from torch_xla[pallas])
  Using cached https://storage.googleapis.com/jax-releases/nightly/jax/jax-0.4.39.dev20250113-py3-none-any.whl (2.3 MB)
Collecting ml_dtypes>=0.4.0 (from jax==0.4.39.dev20250113->torch_xla[pallas])
  Using cached ml_dtypes-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (21 kB)
Collecting opt_einsum (from jax==0.4.39.dev20250113->torch_xla[pallas])
  Using cached opt_einsum-3.4.0-py3-none-any.whl.metadata (6.3 kB)
Collecting scipy>=1.11.1 (from jax==0.4.39.dev20250113->torch_xla[pallas])
  Using cached scipy-1.15.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (2024.12.14)
Using cached ml_dtypes-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.7 MB)
Using cached scipy-1.15.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (40.6 MB)
Using cached opt_einsum-3.4.0-py3-none-any.whl (71 kB)
Installing collected packages: scipy, opt_einsum, ml_dtypes, jaxlib, jax
Successfully installed jax-0.4.39.dev20250113 jaxlib-0.4.39.dev20250113 ml_dtypes-0.5.1 opt_einsum-3.4.0 scipy-1.15.1
(new-xla) $ python
Python 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_xla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hosseins/miniconda3/envs/new-xla/lib/python3.10/site-packages/torch_xla/__init__.py", line 20, in <module>
    import _XLAC
ImportError: /home/hosseins/miniconda3/envs/new-xla/lib/python3.10/site-packages/_XLAC.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN5torch6dynamo8autograd18get_input_metadataERKSt6vectorINS_8autograd4EdgeESaIS4_EE
>>> 

miladm (Collaborator) commented Jan 27, 2025

Ideally, this build combination should work as expected @hosseinsarshar.

I see you were able to unblock yourself by pairing an older torch wheel with the latest torch_xla. I am reopening this so we can investigate further. cc @lsy323 @ysiraichi

@miladm miladm reopened this Jan 27, 2025
hosseinsarshar (Contributor, Author) commented Feb 3, 2025

@miladm / @lsy323 I wonder if you have had a chance to review this issue. I'm asking because the currently pinned torch version (2.6.0.dev20241216+cpu) is out of date by now. Thanks

@ysiraichi (Collaborator)

I couldn't reproduce this issue. I ran the same commands (using micromamba instead), and I was able to import torch_xla without problems.

@ysiraichi (Collaborator)

~
➜ micromamba create -n test python=3.10
conda-forge/noarch                                  18.9MB @  19.5MB/s  1.0s
conda-forge/linux-64                                41.7MB @  15.0MB/s  2.8s


Transaction
  ...
  Updating specs:

   - python=3.10


  Package               Version  Build               Channel           Size
─────────────────────────────────────────────────────────────────────────────
  Install:
─────────────────────────────────────────────────────────────────────────────

  + _libgcc_mutex           0.1  conda_forge         conda-forge     Cached
  + _openmp_mutex           4.5  2_gnu               conda-forge     Cached
  + bzip2                 1.0.8  h4bc722e_7          conda-forge     Cached
  + ca-certificates   2025.1.31  hbcca054_0          conda-forge     Cached
  + ld_impl_linux-64       2.43  h712a8e2_2          conda-forge     Cached
  + libffi                3.4.2  h7f98852_5          conda-forge     Cached
  + libgcc               14.2.0  h77fa898_1          conda-forge     Cached
  + libgcc-ng            14.2.0  h69a702a_1          conda-forge     Cached
  + libgomp              14.2.0  h77fa898_1          conda-forge     Cached
  + liblzma               5.6.4  hb9d3cd8_0          conda-forge     Cached
  + libnsl                2.0.1  hd590300_0          conda-forge     Cached
  + libsqlite            3.48.0  hee588c1_1          conda-forge     Cached
  + libuuid              2.38.1  h0b41bf4_0          conda-forge     Cached
  + libxcrypt            4.4.36  hd590300_1          conda-forge     Cached
  + libzlib               1.3.1  hb9d3cd8_2          conda-forge     Cached
  + ncurses                 6.5  h2d0b736_3          conda-forge     Cached
  + openssl               3.4.0  h7b32b05_1          conda-forge     Cached
  + pip                    25.0  pyh8b19718_0        conda-forge     Cached
  + python              3.10.16  he725a3c_1_cpython  conda-forge     Cached
  + readline                8.2  h8228510_1          conda-forge     Cached
  + setuptools           75.8.0  pyhff2d567_0        conda-forge     Cached
  + tk                   8.6.13  noxft_h4845f30_101  conda-forge     Cached
  + tzdata                2025a  h78e105d_0          conda-forge     Cached
  + wheel                0.45.1  pyhd8ed1ab_1        conda-forge     Cached

  Summary:

  Install: 24 packages

  Total download: 0 B

─────────────────────────────────────────────────────────────────────────────


Confirm changes: [Y/n]
...

~
➜ micromamba activate test

~ via 🅒 test
➜ pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch
  Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.7.0.dev20250206%2Bcpu-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (26 kB)
Collecting torchvision
  Downloading https://download.pytorch.org/whl/nightly/cpu/torchvision-0.22.0.dev20250206%2Bcpu-cp310-cp310-linux_x86_64.whl.metadata (6.2 kB)
...

~ via 🅒 test
➜ pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html
Looking in links: https://storage.googleapis.com/libtpu-releases/index.html, https://storage.googleapis.com/libtpu-wheels/index.html
Collecting torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (from torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (95.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 95.0/95.0 MB 19.7 MB/s eta 0:00:00
Collecting libtpu==0.0.9.dev20250131 (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu/libtpu-0.0.9.dev20250131%2Bnightly-py3-none-linux_x86_64.whl (133.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.1/133.1 MB 28.7 MB/s eta 0:00:00
Collecting tpu-info (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading tpu_info-0.2.0-py3-none-any.whl.metadata (3.7 kB)
Collecting libtpu-nightly==0.1.dev20241010+nightly.cleanup (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu-nightly/libtpu_nightly-0.1.dev20241010%2Bnightly.cleanup-py3-none-any.whl (1.3 kB)
...

~ via 🅒 test
➜ pip install 'torch_xla[pallas]' \
  -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html \
  -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Looking in links: https://storage.googleapis.com/jax-releases/jax_nightly_releases.html, https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Collecting jaxlib==0.5.1.dev20250131 (from torch_xla[pallas])
  Downloading https://storage.googleapis.com/jax-releases/nightly/nocuda/jaxlib-0.5.1.dev20250131-cp310-cp310-manylinux2014_x86_64.whl (103.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.0/103.0 MB 22.5 MB/s eta 0:00:00
Collecting jax==0.5.1.dev20250131 (from torch_xla[pallas])
  Downloading https://storage.googleapis.com/jax-releases/nightly/jax/jax-0.5.1.dev20250131-py3-none-any.whl (2.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 5.0 MB/s eta 0:00:00
Collecting ml_dtypes>=0.4.0 (from jax==0.5.1.dev20250131->torch_xla[pallas])
  Downloading ml_dtypes-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (21 kB)
...

~ via 🅒 test took 15s509ms
➜ python
Python 3.10.16 | packaged by conda-forge | (main, Dec  5 2024, 14:16:10) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_xla
WARNING:root:Defaulting to PJRT_DEVICE=CPU

@hosseinsarshar (Contributor, Author)

Thanks @ysiraichi for testing this.

Checking your logs, I see WARNING:root:Defaulting to PJRT_DEVICE=CPU, which shows it can't find the TPU device by default; that shouldn't be the case (I also mentioned this in my first comment). If you pass PJRT_DEVICE=TPU, it will most probably break; if it doesn't, I expect you won't be able to get the list of devices. Please try listing them with torch_xla.devices().
