Fix Certain DAG import errors ("role does not exist") don't persist in Airflow #51511
Motivation
As described in #49651, when the access control for a DAG is set to a nonexistent role, the DAG import error shows up in the Airflow UI for a while and then disappears. This change fixes the issue so that the import error persists in the metadata DB until the DAG is updated with a correct access control setting.
Close #49651
What is the issue
airflow/airflow-core/src/airflow/dag_processing/collection.py
Line 177 in 56fbe90
When a DAG's access control is set to a nonexistent role, the following process raises the exception "Failed to write serialized DAG dag_id=...". So how is this exception triggered?
On the first run, dag_was_updated is True, because this is the first time SerializedDagModel.write_dag writes the serialized DAG to the database. Since dag_was_updated is True, _sync_dag_perms is triggered to sync the DAG-specific permissions. At that point it detects that the role doesn't exist and raises an error, which results in the exception.
From my understanding, this sync process runs every MIN_SERIALIZED_DAG_UPDATE_INTERVAL. So what happens on the second run? dag_was_updated is False, since the DAG code has not changed, so _sync_dag_perms is NOT triggered even though the access control is still set incorrectly in the DAG code.
What is the fix
Currently, _sync_dag_perms runs only when the DAG is updated (i.e., dag_was_updated is True). This can be more performant because it doesn't run for every DAG, but it cannot keep permissions properly in sync. The fix is therefore to make _sync_dag_perms run for all DAGs during the DAG sync process. I understand it might not be an ideal fix, but I wasn't able to find a better solution given my limited understanding of the code. I would really appreciate it if anyone could suggest ideas to improve it further.
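To make the before/after behavior easier to reason about, here is a toy simulation of the two strategies. It is only a sketch and does not use Airflow's real code: ToyDag, ToyDagStore, KNOWN_ROLES, sync_dag_perms, parse_pass, and the always_sync_perms flag are hypothetical names invented for illustration. It also assumes that the set of import errors is rebuilt on every parsing pass, which is how the disappearing error is modeled here.

```python
from dataclasses import dataclass

# Hypothetical set of roles that exist in the toy "metadata DB".
KNOWN_ROLES = {"Admin", "Viewer"}


@dataclass
class ToyDag:
    dag_id: str
    access_control: dict  # role name -> set of permission names
    code_version: int = 1


class ToyDagStore:
    """Stands in for the metadata DB (not Airflow's real models)."""

    def __init__(self):
        self.serialized_versions = {}
        self.import_errors = {}

    def write_dag(self, dag):
        """Return True only when the stored version changes (mimics dag_was_updated)."""
        if self.serialized_versions.get(dag.dag_id) == dag.code_version:
            return False
        self.serialized_versions[dag.dag_id] = dag.code_version
        return True


def sync_dag_perms(dag):
    """Raise if the DAG references a role that does not exist."""
    missing = sorted(set(dag.access_control) - KNOWN_ROLES)
    if missing:
        raise ValueError(f"role does not exist: {missing}")


def parse_pass(store, dags, always_sync_perms):
    """Run one parsing pass; import errors are rebuilt from scratch (assumption)."""
    errors = {}
    for dag in dags:
        try:
            dag_was_updated = store.write_dag(dag)
            # Old behavior: sync only when the DAG changed.
            # Proposed behavior: sync on every pass.
            if dag_was_updated or always_sync_perms:
                sync_dag_perms(dag)
        except ValueError as exc:
            errors[dag.dag_id] = str(exc)
    store.import_errors = errors
    return errors


bad_dag = ToyDag("example", {"NoSuchRole": {"can_read"}})

old_store = ToyDagStore()
old_first = parse_pass(old_store, [bad_dag], always_sync_perms=False)
old_second = parse_pass(old_store, [bad_dag], always_sync_perms=False)

new_store = ToyDagStore()
new_first = parse_pass(new_store, [bad_dag], always_sync_perms=True)
new_second = parse_pass(new_store, [bad_dag], always_sync_perms=True)

print("old:", bool(old_first), bool(old_second))
print("new:", bool(new_first), bool(new_second))
```

In this toy model the old strategy reports the error on the first pass only, while the always-sync strategy keeps reporting it until the DAG's access control is corrected. The cost is one permission check per DAG per pass, which is the performance trade-off mentioned above.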