fix: avoid unnecessary API call in QueryJob.result() when job is already finished #1900

Merged: 14 commits, Apr 18, 2024

Changes from 1 commit:
Apply suggestions from code review
Co-authored-by: Chalmer Lowe <[email protected]>
tswast and chalmerlowe committed Apr 17, 2024
commit 5149c255cad0d566b93d95c4f8e3dabb9394fe96
2 changes: 1 addition & 1 deletion google/cloud/bigquery/job/query.py
@@ -1528,7 +1528,7 @@ def is_job_done():
 if restart_query_job:
     restart_query_job = False

-    # The orinal job is failed. Create a new one.
+    # The original job has failed. Create a new one.
     #
     # Note that we won't get here if retry_do_query is
     # None, because we won't use a retry.
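
For context, here is a minimal sketch of the restart pattern this hunk lives in; start_query_job() and is_retryable_job_error() are hypothetical stand-ins for client internals, and the real QueryJob.result() drives this logic through google.api_core retry objects rather than a bare polling loop.

import time


def poll_until_done(start_query_job, is_retryable_job_error, interval=1.0):
    """Poll a query job, restarting it when the job itself has failed."""
    job = start_query_job()
    restart_query_job = False

    while True:
        if restart_query_job:
            restart_query_job = False
            # The original job has failed. Create a new one.
            job = start_query_job()

        if job.done():
            error = job.error_result
            if error is not None and is_retryable_job_error(error):
                # Only restart when the failure clearly belongs to the job,
                # not to an ambiguous API-level error.
                restart_query_job = True
                continue
            return job

        time.sleep(interval)

The flag-then-recreate shape matches the diff: the failure path flips restart_query_job, and the next pass through the polling function starts a fresh job.
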
4 changes: 2 additions & 2 deletions google/cloud/bigquery/retry.py
@@ -36,7 +36,7 @@

 _DEFAULT_RETRY_DEADLINE = 10.0 * 60.0  # 10 minutes

-# Allow for a few retries after the API request times out. This relevant for
+# Allow for a few retries after the API request times out. This is relevant for
 # rateLimitExceeded errors, which can be raised either by the Google load
 # balancer or the BigQuery job server.
 _DEFAULT_JOB_DEADLINE = 4.0 * _DEFAULT_RETRY_DEADLINE

Contributor:
What is the purpose of using 4.0 here?
Can we get a comment indicating why 4.0?

Contributor Author:
Updated to 2.0 * (2.0 * _DEFAULT_RETRY_DEADLINE) and added some explanation both here and in QueryJob.result().

Note: This still only gets us 1 query retry in the face of the problematic, ambiguous error codes from jobs.getQueryResults(), but that's better than the nothing that we were actually getting before in some cases. I don't feel comfortable bumping this much further, though maybe 3.0 * 2.0 * _DEFAULT_RETRY_DEADLINE would be slightly less arbitrary at 1 hour?
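
For reference, the arithmetic behind that reply works out as follows; the constants mirror retry.py above, and the split into a per_attempt_budget is my reading of the grouping, not wording taken from the file.

_DEFAULT_RETRY_DEADLINE = 10.0 * 60.0               # 600 s: budget for API-level retries
per_attempt_budget = 2.0 * _DEFAULT_RETRY_DEADLINE  # 1200 s for a single query attempt (assumed grouping)
_DEFAULT_JOB_DEADLINE = 2.0 * per_attempt_budget    # 2400 s (40 min): original attempt plus one query retry

assert _DEFAULT_JOB_DEADLINE == 4.0 * _DEFAULT_RETRY_DEADLINE  # same value as in the diff

# The alternative floated above, 3.0 * 2.0 * _DEFAULT_RETRY_DEADLINE, is 3600 s, i.e. 1 hour.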

@@ -84,7 +84,7 @@ def _should_retry(exc):
 def _job_should_retry(exc):
     # Sometimes we have ambiguous errors, such as 'backendError' which could
     # be due to an API problem or a job problem. For these, make sure we retry
-    # our is_job_done function.
+    # our is_job_done() function.
     #
     # Note: This won't restart the job unless we know for sure it's because of
     # the job status and set restart_query_job = True in that loop. This means
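
To illustrate the kind of predicate this comment describes, a hedged sketch follows; the reason list and the assumption that the exception exposes a list of reason dicts via an errors attribute are illustrative, not the module's actual definitions.

_AMBIGUOUS_OR_RETRYABLE_REASONS = frozenset(
    ["backendError", "rateLimitExceeded", "internalError"]  # assumed list for this sketch
)


def job_should_retry(exc):
    """Return True when the is_job_done() polling call should be retried."""
    # BigQuery API exceptions generally expose a list of
    # {"reason": ..., "message": ...} dicts on an `errors` attribute.
    errors = getattr(exc, "errors", None)
    if not errors:
        return False
    reason = errors[0].get("reason", "")
    # Retrying here only re-runs the polling closure; whether the *job* gets
    # restarted is decided inside that loop via restart_query_job.
    return reason in _AMBIGUOUS_OR_RETRYABLE_REASONS

A predicate along these lines would typically be combined with _DEFAULT_JOB_DEADLINE in a google.api_core.retry.Retry object to form the job-level retry.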