pyarrow import error in 3.20.0 #1877

Closed
pbhuss opened this issue Mar 27, 2024 · 9 comments · Fixed by #1879
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@pbhuss

pbhuss commented Mar 27, 2024

Environment details

  • OS type and version: macOS 14.4.1 (23E224)
  • Python version: 3.10
  • pip version: 24.0
  • google-cloud-bigquery version: 3.20.0

Steps to reproduce

Create a new virtualenv, install google-cloud-bigquery==3.20.0, then attempt to import google.cloud.bigquery:

$ virtualenv -p python3.10 venv
created virtual environment CPython3.10.14.final.0-64 in 322ms
  creator CPython3macOsBrew(dest=/Users/me/venv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/Users/me/Library/Application Support/virtualenv)
    added seed packages: pip==24.0, setuptools==69.1.1, wheel==0.42.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
$ source venv/bin/activate
$ pip install -i https://pypi.python.org/simple google-cloud-bigquery==3.20.0
Looking in indexes: https://pypi.python.org/simple
Collecting google-cloud-bigquery==3.20.0
  Using cached google_cloud_bigquery-3.20.0-py2.py3-none-any.whl.metadata (8.9 kB)
Collecting google-api-core!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1 (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-bigquery==3.20.0)
  Using cached google_api_core-2.18.0-py3-none-any.whl.metadata (2.7 kB)
Collecting google-auth<3.0.0dev,>=2.14.1 (from google-cloud-bigquery==3.20.0)
  Using cached google_auth-2.29.0-py2.py3-none-any.whl.metadata (4.7 kB)
Collecting google-cloud-core<3.0.0dev,>=1.6.0 (from google-cloud-bigquery==3.20.0)
  Using cached google_cloud_core-2.4.1-py2.py3-none-any.whl.metadata (2.7 kB)
Collecting google-resumable-media<3.0dev,>=0.6.0 (from google-cloud-bigquery==3.20.0)
  Using cached google_resumable_media-2.7.0-py2.py3-none-any.whl.metadata (2.2 kB)
Collecting packaging>=20.0.0 (from google-cloud-bigquery==3.20.0)
  Using cached packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Collecting python-dateutil<3.0dev,>=2.7.2 (from google-cloud-bigquery==3.20.0)
  Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting requests<3.0.0dev,>=2.21.0 (from google-cloud-bigquery==3.20.0)
  Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting googleapis-common-protos<2.0.dev0,>=1.56.2 (from google-api-core!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-bigquery==3.20.0)
  Using cached googleapis_common_protos-1.63.0-py2.py3-none-any.whl.metadata (1.5 kB)
Collecting protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0.dev0,>=3.19.5 (from google-api-core!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-bigquery==3.20.0)
  Using cached protobuf-4.25.3-cp37-abi3-macosx_10_9_universal2.whl.metadata (541 bytes)
Collecting proto-plus<2.0.0dev,>=1.22.3 (from google-api-core!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-bigquery==3.20.0)
  Using cached proto_plus-1.23.0-py3-none-any.whl.metadata (2.2 kB)
Collecting grpcio<2.0dev,>=1.33.2 (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-bigquery==3.20.0)
  Using cached grpcio-1.62.1-cp310-cp310-macosx_12_0_universal2.whl.metadata (4.0 kB)
Collecting grpcio-status<2.0.dev0,>=1.33.2 (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-cloud-bigquery==3.20.0)
  Using cached grpcio_status-1.62.1-py3-none-any.whl.metadata (1.3 kB)
Collecting cachetools<6.0,>=2.0.0 (from google-auth<3.0.0dev,>=2.14.1->google-cloud-bigquery==3.20.0)
  Using cached cachetools-5.3.3-py3-none-any.whl.metadata (5.3 kB)
Collecting pyasn1-modules>=0.2.1 (from google-auth<3.0.0dev,>=2.14.1->google-cloud-bigquery==3.20.0)
  Using cached pyasn1_modules-0.4.0-py3-none-any.whl.metadata (3.4 kB)
Collecting rsa<5,>=3.1.4 (from google-auth<3.0.0dev,>=2.14.1->google-cloud-bigquery==3.20.0)
  Using cached rsa-4.9-py3-none-any.whl.metadata (4.2 kB)
Collecting google-crc32c<2.0dev,>=1.0 (from google-resumable-media<3.0dev,>=0.6.0->google-cloud-bigquery==3.20.0)
  Using cached google_crc32c-1.5.0-cp310-cp310-macosx_10_9_universal2.whl.metadata (2.3 kB)
Collecting six>=1.5 (from python-dateutil<3.0dev,>=2.7.2->google-cloud-bigquery==3.20.0)
  Using cached six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Collecting charset-normalizer<4,>=2 (from requests<3.0.0dev,>=2.21.0->google-cloud-bigquery==3.20.0)
  Using cached charset_normalizer-3.3.2-cp310-cp310-macosx_11_0_arm64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests<3.0.0dev,>=2.21.0->google-cloud-bigquery==3.20.0)
  Using cached idna-3.6-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests<3.0.0dev,>=2.21.0->google-cloud-bigquery==3.20.0)
  Using cached urllib3-2.2.1-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests<3.0.0dev,>=2.21.0->google-cloud-bigquery==3.20.0)
  Using cached certifi-2024.2.2-py3-none-any.whl.metadata (2.2 kB)
Collecting pyasn1<0.7.0,>=0.4.6 (from pyasn1-modules>=0.2.1->google-auth<3.0.0dev,>=2.14.1->google-cloud-bigquery==3.20.0)
  Using cached pyasn1-0.6.0-py2.py3-none-any.whl.metadata (8.3 kB)
Using cached google_cloud_bigquery-3.20.0-py2.py3-none-any.whl (233 kB)
Using cached google_api_core-2.18.0-py3-none-any.whl (138 kB)
Using cached google_auth-2.29.0-py2.py3-none-any.whl (189 kB)
Using cached google_cloud_core-2.4.1-py2.py3-none-any.whl (29 kB)
Using cached google_resumable_media-2.7.0-py2.py3-none-any.whl (80 kB)
Using cached packaging-24.0-py3-none-any.whl (53 kB)
Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Using cached cachetools-5.3.3-py3-none-any.whl (9.3 kB)
Using cached certifi-2024.2.2-py3-none-any.whl (163 kB)
Using cached charset_normalizer-3.3.2-cp310-cp310-macosx_11_0_arm64.whl (120 kB)
Using cached google_crc32c-1.5.0-cp310-cp310-macosx_10_9_universal2.whl (32 kB)
Using cached googleapis_common_protos-1.63.0-py2.py3-none-any.whl (229 kB)
Using cached grpcio-1.62.1-cp310-cp310-macosx_12_0_universal2.whl (10.0 MB)
Using cached grpcio_status-1.62.1-py3-none-any.whl (14 kB)
Using cached idna-3.6-py3-none-any.whl (61 kB)
Using cached proto_plus-1.23.0-py3-none-any.whl (48 kB)
Using cached protobuf-4.25.3-cp37-abi3-macosx_10_9_universal2.whl (394 kB)
Using cached pyasn1_modules-0.4.0-py3-none-any.whl (181 kB)
Using cached rsa-4.9-py3-none-any.whl (34 kB)
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Using cached urllib3-2.2.1-py3-none-any.whl (121 kB)
Using cached pyasn1-0.6.0-py2.py3-none-any.whl (85 kB)
Installing collected packages: urllib3, six, pyasn1, protobuf, packaging, idna, grpcio, google-crc32c, charset-normalizer, certifi, cachetools, rsa, requests, python-dateutil, pyasn1-modules, proto-plus, googleapis-common-protos, google-resumable-media, grpcio-status, google-auth, google-api-core, google-cloud-core, google-cloud-bigquery
Successfully installed cachetools-5.3.3 certifi-2024.2.2 charset-normalizer-3.3.2 google-api-core-2.18.0 google-auth-2.29.0 google-cloud-bigquery-3.20.0 google-cloud-core-2.4.1 google-crc32c-1.5.0 google-resumable-media-2.7.0 googleapis-common-protos-1.63.0 grpcio-1.62.1 grpcio-status-1.62.1 idna-3.6 packaging-24.0 proto-plus-1.23.0 protobuf-4.25.3 pyasn1-0.6.0 pyasn1-modules-0.4.0 python-dateutil-2.9.0.post0 requests-2.31.0 rsa-4.9 six-1.16.0 urllib3-2.2.1
$ python -c "import google.cloud.bigquery"
Traceback (most recent call last):
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/_versions_helpers.py", line 75, in try_import
    import pyarrow
ModuleNotFoundError: No module named 'pyarrow'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/__init__.py", line 35, in <module>
    from google.cloud.bigquery.client import Client
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/client.py", line 69, in <module>
    from google.cloud.bigquery import _job_helpers
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/_job_helpers.py", line 47, in <module>
    from google.cloud.bigquery import job
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/job/__init__.py", line 27, in <module>
    from google.cloud.bigquery.job.copy_ import CopyJob
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/job/copy_.py", line 22, in <module>
    from google.cloud.bigquery.table import TableReference
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/table.py", line 62, in <module>
    from google.cloud.bigquery import _pandas_helpers
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 52, in <module>
    pyarrow = _versions_helpers.PYARROW_VERSIONS.try_import(raise_if_error=True)
  File "/Users/me/venv/lib/python3.10/site-packages/google/cloud/bigquery/_versions_helpers.py", line 78, in try_import
    raise exceptions.LegacyPyarrowError(
google.cloud.bigquery.exceptions.LegacyPyarrowError: pyarrow package not found. Install pyarrow version >= 3.0.0.
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Mar 27, 2024
@shollyman shollyman added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Mar 27, 2024
@shollyman
Contributor

Thanks for flagging this. I've tagged in folks to help take a look at this, but likely there won't be any substantial updates to this until tomorrow.

@wilstoff

wilstoff commented Mar 28, 2024

Also ran into this. I would highly recommend pulling the 3.20 release from PyPI until a true fix can be implemented.

@nickbarton-wiq

Also ran into this issue when using dbt-bigquery

@mikealfare

We also ran into this. Updating the dependency to google-cloud-bigquery[pandas] resolved it, in our case at least, because pyarrow was added to the pandas extra in 3.20.0. It appears pyarrow may now be a required dependency rather than an optional extra.
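
A quick way to check whether an environment is exposed is to ask Python whether pyarrow can be found before importing google.cloud.bigquery. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of google-cloud-bigquery.

```python
import importlib.util


def is_installed(top_level_name: str) -> bool:
    """Return True if a top-level module can be located without importing it.

    Hypothetical helper for illustration; pass top-level names only
    (for example "pyarrow", not "pyarrow.compute").
    """
    return importlib.util.find_spec(top_level_name) is not None


if __name__ == "__main__":
    if not is_installed("pyarrow"):
        # Per the report above, installing the [pandas] extra pulled pyarrow in.
        print("pyarrow not found in this environment")
```

Because `find_spec` only locates the module, the check itself does not trigger the failing import chain shown in the traceback.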

@manuellazzari-cargoone

Also running into the same issue. We pinned our requirements.txt to avoid this release:

google-cloud-bigquery!=3.20.0

Hopefully they'll fix this quickly.
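
To illustrate what the `!=3.20.0` pin above lets through, here is a rough standard-library sketch. It only handles plain dotted numeric versions and is not a full PEP 440 implementation; the function names are hypothetical. Under PEP 440, release segments are zero-padded for comparison, so the pin also excludes a version written as `3.20`.

```python
def release_tuple(version: str) -> tuple:
    """Parse a plain dotted version such as "3.20.0" into a tuple of ints.

    Trailing zeros are dropped so that "3.20" and "3.20.0" compare equal.
    Assumes purely numeric segments (no pre-release or dev suffixes).
    """
    parts = [int(p) for p in version.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def allowed_by_pin(version: str, excluded: str = "3.20.0") -> bool:
    """Return True if `version` satisfies a "!=<excluded>" style pin."""
    return release_tuple(version) != release_tuple(excluded)


if __name__ == "__main__":
    for candidate in ("3.19.0", "3.20.0", "3.20.1"):
        print(candidate, allowed_by_pin(candidate))
```

An exclusion pin like this skips only the broken release, so the resolver remains free to pick up a later fixed version without another edit to requirements.txt.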

@vchetyrkine

vchetyrkine commented Mar 28, 2024

I believe the issue stems from

pyarrow = _versions_helpers.PYARROW_VERSIONS.try_import(raise_if_error=True)

https://github.com/googleapis/python-bigquery/blob/v3.20.0/google/cloud/bigquery/_pandas_helpers.py#L52

introduced in commit 0ac6e9b
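
The general pattern at play can be sketched as follows. This is a self-contained illustration of an optional import with a `raise_if_error` switch, not the library's actual code: `try_import` here is a hypothetical stand-alone function and `MissingOptionalDependency` is a made-up exception class standing in for whatever the real helper raises.

```python
import importlib
from types import ModuleType
from typing import Optional


class MissingOptionalDependency(ImportError):
    """Hypothetical error type used only in this sketch."""


def try_import(name: str, raise_if_error: bool = False) -> Optional[ModuleType]:
    """Import an optional module by name.

    Returns the module, or None when it is missing and raise_if_error is False.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        if raise_if_error:
            # "raise ... from exc" yields the "direct cause" chained traceback.
            raise MissingOptionalDependency(f"{name} package not found.") from exc
        return None


# Tolerant form: safe at module import time, since a missing package gives None.
optional_module = try_import("no_such_module_xyz_123")

# Strict form: better deferred to the function that actually needs the package.
def require(name: str) -> ModuleType:
    return try_import(name, raise_if_error=True)
```

Calling the strict form at module level, as on the line quoted above, makes the optional package mandatory for anyone who imports the module, which matches the chained traceback in the original report.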

@shollyman
Contributor

Quick update: Overnight we were able to embargo the release on PyPI, and we're now working through the issues it introduced.

@chalmerlowe
Contributor

PR #1879 is under review. Thanks to @tswast for jumping on this to generate a potential fix.

@tswast
Contributor

tswast commented Apr 1, 2024

3.20.1 has been released with this fix. #1880
