Arrow conversion errors do not indicate problematic dataframe column. #1621

Closed
karkinissan opened this issue Jul 21, 2023 · 2 comments · Fixed by #1836
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments


karkinissan commented Jul 21, 2023

When dataframe_to_arrow fails to convert a dataframe column to a pyarrow array, the error it raises does not tell the user which column caused the failure.

File ~\miniconda3\envs\playground\Lib\site-packages\google\cloud\bigquery\_pandas_helpers.py:704, in dataframe_to_parquet(dataframe, bq_schema, filepath, parquet_compression, parquet_use_compliant_nested_type)
    697 kwargs = (
    698     {"use_compliant_nested_type": parquet_use_compliant_nested_type}
    699     if _helpers.PYARROW_VERSIONS.use_compliant_nested_type
    700     else {}
    701 )
    703 bq_schema = schema._to_schema_fields(bq_schema)
--> 704 arrow_table = dataframe_to_arrow(dataframe, bq_schema)
    705 pyarrow.parquet.write_table(
    706     arrow_table,
    707     filepath,
    708     compression=parquet_compression,
    709     **kwargs,
    710 )

File ~\miniconda3\envs\playground\Lib\site-packages\google\cloud\bigquery\_pandas_helpers.py:647, in dataframe_to_arrow(dataframe, bq_schema)
    644 for bq_field in bq_schema:
    645     arrow_names.append(bq_field.name)
    646     arrow_arrays.append(
--> 647         bq_to_arrow_array(get_column_or_index(dataframe, bq_field.name), bq_field)
    648     )
    649     arrow_fields.append(bq_to_arrow_field(bq_field, arrow_arrays[-1].type))
    651 if all((field is not None for field in arrow_fields)):

File ~\miniconda3\envs\playground\Lib\site-packages\google\cloud\bigquery\_pandas_helpers.py:362, in bq_to_arrow_array(series, bq_field)
    360     return pyarrow.StructArray.from_pandas(series, type=arrow_type)
    361 try:
--> 362     return pyarrow.Array.from_pandas(series, type=arrow_type)
    363 except Exception as e: 
    364     _LOGGER.error(f"Error in column: {series.name}")

File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\array.pxi:1044, in pyarrow.lib.Array.from_pandas()
File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\array.pxi:316, in pyarrow.lib.array()
File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\array.pxi:83, in pyarrow.lib._ndarray_to_array()
File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\error.pxi:123, in pyarrow.lib.check_status()

ArrowTypeError: object of type <class 'str'> cannot be converted to int
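
The behavior requested here can be illustrated with a minimal, library-independent sketch: convert each column inside its own try/except and re-raise with the column name attached. Everything named below (ColumnConversionError, convert_columns, to_int_column) is hypothetical and exists only for illustration; none of it is part of google-cloud-bigquery or pyarrow, and to_int_column is a stand-in for the real per-column conversion such as pyarrow.Array.from_pandas.

```python
from typing import Any, Callable, Dict, List, Mapping


class ColumnConversionError(TypeError):
    """Hypothetical error type (not a real library class) that names the failing column."""

    def __init__(self, column: str, cause: Exception) -> None:
        super().__init__(f"Error converting column {column!r}: {cause}")
        self.column = column


def convert_columns(
    columns: Mapping[str, List[Any]],
    convert: Callable[[List[Any]], Any],
) -> Dict[str, Any]:
    """Convert columns one at a time so a failure can be tied to a column name."""
    converted: Dict[str, Any] = {}
    for name, values in columns.items():
        try:
            converted[name] = convert(values)
        except Exception as exc:
            # Chain the original exception so the underlying traceback is kept.
            raise ColumnConversionError(name, exc) from exc
    return converted


def to_int_column(values: List[Any]) -> List[int]:
    """Stand-in converter (assumption): rejects anything that is not an int."""
    for value in values:
        if not isinstance(value, int):
            raise TypeError(f"object of type {type(value)} cannot be converted to int")
    return list(values)


try:
    convert_columns({"id": [1, 2], "age": [30, "unknown"]}, to_int_column)
except ColumnConversionError as err:
    # The message now includes the offending column name ("age").
    print(err)
```

Chaining with "raise ... from exc" keeps the original error visible in the traceback while the outer message carries the column name, so the user does not have to bisect the dataframe by hand.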

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Jul 21, 2023
karkinissan added a commit to karkinissan/python-bigquery that referenced this issue Jul 21, 2023
@meredithslota meredithslota added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Nov 6, 2023
@chalmerlowe
Contributor

PR #1836 is being advanced to close this issue (#1621).
In addition, there is a duplicate issue, #1822, that covers the same problem.
Closing this one so that we only have one open issue related to this problem.

@karkinissan
Author

Awesome. Thanks for fixing this.
