Skip to content

Upgrade FAB to FAB 5 #50960

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Upgrade FAB to FAB 5 #50960

wants to merge 1 commit into from

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented May 22, 2025

Switch to using flask-sqlalchemy db session management, and include Auth
Manager Provider Test Isolation and Reliability

  • Refactor test fixtures to always use Flask app contexts for DB and app
    operations.
  • Add a global pytest fixture to clear SQLAlchemy metadata before each
    test, reducing test flakiness.
  • Standardize session access and cleanup patterns across all tests.
  • Refactor user/role creation and deletion to ensure proper isolation.
  • Update test logic to use new app and auth manager creation utilities.
  • Remove or update redundant or fragile test code.
  • Normalize test constants and improve code readability.

Note: Clearing SQLAlchemy metadata and explicit app context management
are workarounds for test isolation issues. The root cause of metadata
and pool persistence between tests should be investigated further for a
more robust solution.

General Theme:

The main strategies are:

  • Ensuring proper use of Flask app contexts in tests
  • Cleaning up database state and metadata between tests
  • Using the correct SQLAlchemy session and interface patterns
  • Refactoring test fixtures for better isolation and reliability
  • Removing or updating code that is no longer needed or that could cause
    test flakiness

Key File-by-File Changes

  1. providers/fab/tests/unit/fab/conftest.py (New file)

What: Adds a global pytest fixture that clears SQLAlchemy metadata
before each test.

Why: This is to prevent metadata leakage between tests, which can cause
flaky or non-deterministic test results.

Uncertainty: Clearing metadata is a workaround; the root cause of
metadata persistence between tests may need deeper investigation.

  1. Test files for API endpoints, CLI commands, models, schemas, and views

What:

  • Many test fixtures now use with app.app_context(): to ensure all DB
    and app operations are performed within a Flask application context.
  • User and role creation/deletion is now always wrapped in an app
    context.
  • Some teardown logic is moved into the fixture's context manager to
    ensure cleanup happens after the test.
  • Session access is standardized to use appbuilder.session instead of
    get_session.
  • Some test constants (e.g., default times) are normalized (e.g.,
    removing timezone info).
  • Some test logic is refactored for clarity and reliability (e.g., using
    addfinalizer for logout in user endpoint tests).

Why:

Ensures that tests do not leak state or context, which can cause
failures when running tests in parallel or in different environments.
Using the correct session and context patterns is more robust and
future-proof.

Uncertainty:

While these changes improve test isolation, the need to clear metadata
and manage app contexts so explicitly suggests there may be deeper
issues with how the test environment is set up or torn down. Further
investigation into the test infrastructure may be warranted.

  1. test_fab_auth_manager.py and related files

What:

  • Switches to using the new create_app and get_auth_manager utilities
    for creating Flask apps and auth managers.
  • Updates test logic to use the new app and session patterns.
  • Fixes some test assertions to compare user IDs instead of user objects
    directly.
  • Moves some permission synchronization logic to after DAG creation and
    session commit.

Why:

These changes align the tests with the latest best practices and APIs in
Airflow and Flask AppBuilder. They also fix subtle bugs where tests
could pass or fail depending on object identity rather than value.

Uncertainty:

The need to manually commit and close sessions, and to synchronize
permissions, may indicate that the test setup/teardown is not fully
robust.

  1. test_security.py

What:

  • Removes some unused or redundant code (e.g., a test for DAG permission
    views).
  • Updates session and app context usage.

Why:

Cleans up the test suite and ensures all tests are using the correct patterns.

Uncertainty:

The removal of some tests may need to be reviewed to ensure no loss of coverage.

  1. Miscellaneous

What:

  • Minor formatting and import cleanups.
  • Some test parameters and constants are updated for consistency.

Why:

Improves code readability and maintainability.

Summary of Uncertainties and Next Steps

Clearing SQLAlchemy metadata and disposing pools:

These are workarounds for test isolation issues. The root cause (why
metadata and pools persist between tests) should be investigated
further. Ideally, the test infrastructure should handle this
automatically.

App context management:

The need for explicit app contexts in so many places may indicate that
the test setup could be improved to provide a more consistent
environment.

Session and teardown logic:

Manual session management and teardown in tests can be error-prone.
Consider centralizing this logic or using more robust fixtures.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:providers provider:fab provider:google Google (including GCP) related issues labels May 22, 2025
@potiuk potiuk force-pushed the fab-5 branch 4 times, most recently from c0620a6 to dce938a Compare May 29, 2025 19:10
@potiuk potiuk force-pushed the fab-5 branch 5 times, most recently from 54477dc to 5ce4949 Compare June 5, 2025 08:59
@potiuk potiuk changed the title Upgrade Flask-appbuilder to 5 Upgrade FAB to FAB 5 Jun 5, 2025
@@ -209,6 +209,17 @@ def _calculate_provider_deps_hash():

os.environ["AIRFLOW__CORE__ALLOWED_DESERIALIZATION_CLASSES"] = "airflow.*\nunit.*\n"
os.environ["AIRFLOW__CORE__PLUGINS_FOLDER"] = os.fspath(AIRFLOW_CORE_TESTS_PATH / "unit" / "plugins")

if os.environ.get("_AIRFLOW_SKIP_DB_TESTS", "false") == "true":
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@potiuk -> self comment - this should be extracted to a separate commit as this and next one improve plugins manager behaviour in non-db tests.

init_appbuilder(app, enable_plugins=False)
init_plugins(app)
init_airflow_session_interface(app)
return app.appbuilder # type: ignore[attr-defined]


current_appbuilder: AirflowAppBuilder | None = None
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows nesting get_appliction_builder() -> when it called second time in the stack, it will return the already retrieved app builder and will not create new app context.

)
first_name: Mapped[str] = mapped_column(String(64), nullable=False)
last_name: Mapped[str] = mapped_column(String(64), nullable=False)
username: Mapped[str] = mapped_column(
Copy link
Member Author

@potiuk potiuk Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpgaspar -> why did you get rid of case-sensitiveness for username in FAB - sqlite needs it still - (We brought it back as it failed our tests, but this was a bit of a surprise.

init_wsgi_middleware(flask_app)
from flask_appbuilder.extensions import db

db.engine.dispose(close=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@potiuk @vincbeck @dpgaspar - I think we need to understand why we have to do it here. If we don't the number of connections established to the database grows infinitely when multiple apps are initialized in a single interpreter - generally the connections should be| ref-count or garbage-collector cleaned, but they are not.

from flask_appbuilder.extensions import db

db.init_app(app)
db.engine.dispose(close=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here:

@potiuk @vincbeck @dpgaspar - I think we need to understand why we have to do it here. If we don't the number of connections established to the database grows infinitely when multiple apps are initialized in a single interpreter - generally the connections should be| ref-count or garbage-collector cleaned, but they are not.

@@ -215,74 +210,3 @@ def __init__(self, datamodel):
filters.append(FilterIsNull)
if FilterIsNotNull not in filters:
filters.append(FilterIsNotNull)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpgaspar -> I guess we do not need that any more?

def clear_metadata():
from flask_appbuilder.extensions import db

db.metadata.clear()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not do this we get a lot of errors. My guess is that it comes from the fact that flask_appbuilder session is scoped to "application_context" and Airflow DB sessions are scoped to "local thread" and somehow they override each other's metadata.

example here: https://github.com/apache/airflow/actions/runs/15462899926/job/43529356760#step:6:5328

---------------------------- Captured stdout setup -----------------------------
Please make sure to build the frontend in static/ directory and restart the server
[2025-06-05T09:31:35.960+0000] {override.py:897} WARNING - No user yet created, use flask fab command to do it.
_ ERROR at setup of TestFabAuthManager.test_get_authorized_dag_ids[PUT-user_permissions5-expected_results5] _
providers/fab/tests/unit/fab/auth_manager/test_fab_auth_manager.py:112: in flask_app
    app = create_app(enable_plugins=False)
providers/fab/src/airflow/providers/fab/www/app.py:99: in create_app
    init_airflow_session_interface(flask_app)
providers/fab/src/airflow/providers/fab/www/extensions/init_session.py:50: in init_airflow_session_interface
    app.session_interface = AirflowDatabaseSessionInterface(
/usr/local/lib/python3.9/site-packages/flask_session/sqlalchemy/sqlalchemy.py:103: in __init__
    self.sql_session_model = create_session_model(
/usr/local/lib/python3.9/site-packages/flask_session/sqlalchemy/sqlalchemy.py:21: in create_session_model
    class Session(db.Model):
/usr/local/lib/python3.9/site-packages/flask_sqlalchemy/model.py:100: in __init__
    super().__init__(name, bases, d, **kwargs)
/usr/local/lib/python3.9/site-packages/flask_sqlalchemy/model.py:120: in __init__
    super().__init__(name, bases, d, **kwargs)
/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/decl_api.py:76: in __init__
    _as_declarative(reg, cls, dict_)
/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/decl_base.py:126: in _as_declarative
    return _MapperConfig.setup_mapping(registry, cls, dict_, None, {})
/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/decl_base.py:183: in setup_mapping
    return cfg_cls(registry, cls_, dict_, table, mapper_kw)
/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/decl_base.py:331: in __init__
    self._setup_table(table)
/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/decl_base.py:854: in _setup_table
    table_cls(
/usr/local/lib/python3.9/site-packages/flask_sqlalchemy/model.py:147: in __table_cls__
    return sa.Table(*args, **kwargs)
<string>:2: in __new__
    ???
/usr/local/lib/python3.9/site-packages/sqlalchemy/util/deprecations.py:375: in warned
    return fn(*args, **kwargs)
/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/schema.py:596: in __new__
    raise exc.InvalidRequestError(
E   sqlalchemy.exc.InvalidRequestError: Table 'session' is already defined for this MetaData instance.  Specify 'extend_existing=True' to redefine options and columns on an existing Table object.
---------------------------- Captured stdout setup -----------------------------
Please make sure to build the frontend in static/ directory and restart the server
[2025-06-05T09:31:36.525+0000] {override.py:897} WARNING - No user yet created, use flask fab command to do it.

permanent=permanent_cookie,
# Typically these would be configurable with Flask-Session,
# but we will set them explicitly instead as they don't make
# sense to have configurable in Airflow's use case
table="session",
key_prefix="",
use_signer=True,
cleanup_n_requests=5,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 5? Is it something arbitrary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm . Not sure :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is in that the same constant in FAB ?

@vincbeck vincbeck force-pushed the fab-5 branch 4 times, most recently from d4699d4 to 74be7ec Compare June 6, 2025 13:45
Switch to using flask-sqlalchemy db session management, and include Auth
Manager Provider Test Isolation and Reliability

- Refactor test fixtures to always use Flask app contexts for DB and app
  operations.
- Add a global pytest fixture to clear SQLAlchemy metadata before each
  test, reducing test flakiness.
- Standardize session access and cleanup patterns across all tests.
- Refactor user/role creation and deletion to ensure proper isolation.
- Update test logic to use new app and auth manager creation utilities.
- Remove or update redundant or fragile test code.
- Normalize test constants and improve code readability.

Note: Clearing SQLAlchemy metadata and explicit app context management
are workarounds for test isolation issues. The root cause of metadata
and pool persistence between tests should be investigated further for a
more robust solution.

General Theme:

The main strategies are:

* Ensuring proper use of Flask app contexts in tests
* Cleaning up database state and metadata between tests
* Using the correct SQLAlchemy session and interface patterns
* Refactoring test fixtures for better isolation and reliability
* Removing or updating code that is no longer needed or that could cause
  test flakiness

Key File-by-File Changes

1. providers/fab/tests/unit/fab/conftest.py (New file)

What: Adds a global pytest fixture that clears SQLAlchemy metadata
before each test.

Why: This is to prevent metadata leakage between tests, which can cause
flaky or non-deterministic test results.

Uncertainty: Clearing metadata is a workaround; the root cause of
metadata persistence between tests may need deeper investigation.

2. Test files for API endpoints, CLI commands, models, schemas, and views

What:

* Many test fixtures now use with app.app_context(): to ensure all DB
  and app operations are performed within a Flask application context.
* User and role creation/deletion is now always wrapped in an app
  context.
* Some teardown logic is moved into the fixture's context manager to
  ensure cleanup happens after the test.
* Session access is standardized to use appbuilder.session instead of
  get_session.
* Some test constants (e.g., default times) are normalized (e.g.,
  removing timezone info).
* Some test logic is refactored for clarity and reliability (e.g., using
  addfinalizer for logout in user endpoint tests).

Why:

Ensures that tests do not leak state or context, which can cause
failures when running tests in parallel or in different environments.
Using the correct session and context patterns is more robust and
future-proof.

Uncertainty:

While these changes improve test isolation, the need to clear metadata
and manage app contexts so explicitly suggests there may be deeper
issues with how the test environment is set up or torn down. Further
investigation into the test infrastructure may be warranted.

3. test_fab_auth_manager.py and related files

What:

* Switches to using the new create_app and get_auth_manager utilities
  for creating Flask apps and auth managers.
* Updates test logic to use the new app and session patterns.
* Fixes some test assertions to compare user IDs instead of user objects
  directly.
* Moves some permission synchronization logic to after DAG creation and
  session commit.

Why:

These changes align the tests with the latest best practices and APIs in
Airflow and Flask AppBuilder.  They also fix subtle bugs where tests
could pass or fail depending on object identity rather than value.

Uncertainty:
The need to manually commit and close sessions, and to synchronize
permissions, may indicate that the test setup/teardown is not fully
robust.

4. test_security.py

What:

* Removes some unused or redundant code (e.g., a test for DAG permission
  views).
* Updates session and app context usage.

Why:

Cleans up the test suite and ensures all tests are using the correct patterns.

Uncertainty:
The removal of some tests may need to be reviewed to ensure no loss of coverage.

5. Miscellaneous

What:
* Minor formatting and import cleanups.
* Some test parameters and constants are updated for consistency.

Why:

Improves code readability and maintainability.

Summary of Uncertainties and Next Steps

* Clearing SQLAlchemy metadata and disposing pools:

These are workarounds for test isolation issues. The root cause (why
metadata and pools persist between tests) should be investigated
further. Ideally, the test infrastructure should handle this
automatically.

App context management:

The need for explicit app contexts in so many places may indicate that
the test setup could be improved to provide a more consistent
environment.

Session and teardown logic:

Manual session management and teardown in tests can be error-prone.
Consider centralizing this logic or using more robust fixtures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API area:providers provider:fab provider:google Google (including GCP) related issues
Projects
Status: Dependencies
Development

Successfully merging this pull request may close these issues.

2 participants