Correctly treat requeues on reschedule sensors as resetting after each reschedule #51410
The changes that introduced this bug were in #43520.
Passing breeze tests now.
Co-authored-by: Jed Cunningham <[email protected]>
Included in what?
I would like this one comment fixed first, though: https://github.com/apache/airflow/pull/51410/files#r2175543440
When we determine how many requeues a task has had, we want the count to apply only to this "reschedule", not across all "reschedules" for this try.
Backport failed to create: v3-0-test. View the failure log.
You can attempt to backport this manually by running: cherry_picker a362101 v3-0-test
This should apply the commit to the v3-0-test branch and leave the commit in a conflict state. After you have resolved the conflicts, you can continue the backport process by running: cherry_picker --continue
Backport failed to create: v2-11-test. View the failure log.
You can attempt to backport this manually by running: cherry_picker a362101 v2-11-test
This should apply the commit to the v2-11-test branch and leave the commit in a conflict state. After you have resolved the conflicts, you can continue the backport process by running: cherry_picker --continue
Reschedule sensors go into and out of running repeatedly within each try_number. Because the requeue logic allows 3 requeues per try_number, a reschedule sensor that needs requeues many hours apart can still fail. This PR changes that so that only requeues after the last time the task was running (if ever) are counted.
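The difference between the two counting rules can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the `Event` type, the `"running"` and `"requeue"` labels, and both function names are hypothetical and exist only to show the idea of counting requeues since the task last ran rather than across the whole try.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical record of something that happened to a task within one try."""
    when: datetime
    kind: str  # assumed labels: "running" or "requeue"


def count_all_requeues(events):
    """Old behaviour as described above: every requeue in the try is counted."""
    return sum(1 for e in events if e.kind == "requeue")


def count_requeues_since_last_running(events):
    """New behaviour as described above: only requeues after the most recent
    "running" event are counted; if the task never ran, all requeues count."""
    count = 0
    for e in sorted(events, key=lambda e: e.when):
        if e.kind == "running":
            count = 0  # the task ran, so earlier requeues no longer matter
        elif e.kind == "requeue":
            count += 1
    return count


# A reschedule sensor that is requeued a few times, hours apart, within one try.
start = datetime(2025, 1, 1)
history = [
    Event(start, "requeue"),
    Event(start + timedelta(hours=1), "running"),
    Event(start + timedelta(hours=5), "requeue"),
    Event(start + timedelta(hours=6), "requeue"),
    Event(start + timedelta(hours=7), "running"),
    Event(start + timedelta(hours=12), "requeue"),
]
```

With this history the per-try count reaches 4, which would exceed a limit of 3, while the count since the last run is 1.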
closes #49971